INTRODUCTION


The education of the mathematics major begins with the study
of three basic disciplines: mathematical analysis, analytic geometry
and higher algebra. These disciplines have a number of points
of contact, some of which overlap; together they constitute the
foundation upon which rests the whole edifice of modern mathematics.

Higher algebra, the subject of this text, is a far-reaching
and natural generalization of the basic school course of elementary
algebra. Central to elementary algebra is without doubt the problem
of solving equations. The study of equations begins with the very
simple case of one equation of the first degree in one unknown.
From there on, the development proceeds in two directions: to
systems of two and three equations of the first degree in two and,
respectively, three unknowns, and to a single quadratic equation
in one unknown and also to a few special types of higher-degree
equations which readily reduce to quadratic equations (quartic
equations, for example).

Both trends are further developed in the course of higher algebra,
thus determining its two large areas of study. One, the foundations
of linear algebra, starts with the study of arbitrary systems of
equations of the first degree (linear equations). When the number
of equations equals the number of unknowns, solutions of such
systems are obtained by means of the theory of determinants. However,
the theory proves insufficient when studying systems of linear
equations in which the number of equations is not equal to the
number of unknowns. This is a novel feature from the standpoint
of elementary algebra, but it is very important in practical
applications. This stimulated the development of the theory of matrices,
which are systems of numbers arranged in square or rectangular
arrays made up of rows and columns. Matrix theory proved to be
very deep and has found application far beyond the limits of the
theory of systems of linear equations. On the other hand, investigations
into systems of linear equations gave rise to multidimensional
(so-called vector or linear) spaces. To the nonmathematician,
multidimensional space (four-dimensional, to begin with) is a nebulous
and often confusing concept. Actually, however, the notion is
a strictly mathematical one, mainly algebraic, and serves as an
important tool in a variety of mathematical investigations and
also in physics.

The second half of the course of higher algebra, called the algebra
of polynomials, is devoted to the study of a single equation in one
unknown but of arbitrary degree. Since there is a formula for solving
quadratic equations, it was natural to seek similar formulas for
higher-degree equations. That is precisely how this division of
algebra developed historically. Formulas for solving equations
of third and fourth degree were found in the sixteenth century.
The search was then on for formulas capable of expressing the roots
of equations of fifth and higher degree in terms of the coefficients
of the equations by means of radicals, even radicals within radicals.
It was futile, though it continued up to the beginning of the
nineteenth century, when it was proved that no such formulas exist
and that for all degrees beyond the fourth there even exist specific
examples of equations with integral coefficients whose roots cannot
be written down by means of radicals.

One should not be saddened by this absence of formulas for
solving equations of higher degrees, for even in the case of third
and fourth degree equations, where such formulas exist, computations
are extremely involved and, in a practical sense, almost useless.
On the other hand, the coefficients of equations one encounters in
physics and engineering are usually quantities obtained in
measurements. These are approximations and therefore the roots need
only be known approximately, to within a specified accuracy. This
led to the elaboration of a variety of methods of approximate solution
of equations; only the most elementary methods are given
in the course of higher algebra.

However, in the algebra of polynomials the main thing is not
the problem of finding the roots of equations, but the problem of
their existence. For example, we even know of quadratic equations
with real coefficients that do not have real-valued roots. By extending
the range of numbers to include the collection of complex numbers,
we find that quadratic equations do have roots and that this holds
true for equations of the third and fourth degree as well, as follows
from the existence of formulas for their solution. But perhaps there
are equations of the fifth and higher degree without a single root
even in the class of complex numbers. Will it not be necessary,
when seeking the roots of such equations, to pass from complex
numbers to a still bigger class of numbers? The answer to this question
is contained in an important theorem which asserts that any
equation with numerical coefficients, whether real or complex, has
complex-valued (real-valued, as a special case) roots; and, generally
speaking, the number of roots is equal to the degree of the equation.

Such, in brief, is the basic content of the course of higher algebra.
It must be stressed that higher algebra is only the starting point of
the vast science of algebra which is very rich, extremely ramified
and constantly expanding. Let us attempt, even more sketchily,
to survey the various branches of algebra which, in the main, lie
beyond the scope of the course of higher algebra.

Linear algebra, which is a broad field devoted mainly to the
theory of matrices and the associated theory of linear transformations
of vector spaces, includes also the theory of forms, the theory
of invariants and tensor algebra, which plays an important role 
in differential geometry. The theory of vector spaces is further 
developed outside the scope of algebra, in functional analysis 
(infinite-dimensional spaces). Linear algebra continues, so far, 
to occupy first place among the numerous branches of algebra as to 
diversity and significance of its applications in mathematics, physics 
and the engineering sciences. 

The algebra of polynomials, which over many decades has 
been growing as a science concerned with one equation of arbitrary 
degree in one unknown, has now in the main completed its develop- 
ment. It was further developed in part in certain divisions of the 
theory of functions of a complex variable, but basically grew into 
the theory of fields, which we will speak of later on. Now the very
difficult problem of systems of equations of arbitrary degree (not 
linear) in several unknowns— it embraces both divisions of the 
course of higher algebra and is hardly touched on in this text— actual- 
ly has to do with a special branch of mathematics called algebraic 
geometry. 

An exhaustive treatment of the problem of the conditions under 
which an equation can be solved in terms of radicals was given 
by the French mathematician Galois (1811-1832). His investiga- 
tions pointed out new vistas in the development of algebra and led, 
in the twentieth century, after the work of the German woman-algebraist
E. Noether (1882-1935), to the establishment of a fresh
viewpoint on the problems of algebraic science. There is no doubt 
now that the central problem of algebra is not the study of equa- 
tions. The true subject of algebraic study is algebraic operations, 
like those of addition and multiplication of numbers, but possibly 
involving entities other than numbers. 

In school physics one deals with the operation of composition 
of forces. The mathematical disciplines studied in the junior courses 
of universities and teachers’ colleges provide numerous examples
of algebraic operations: the addition and multiplication of matrices 
and functions, operations involving vectors, transformations of 
space, etc. These operations are usually similar to those involving 
numbers and bear the same names, but occasionally some of the 
properties which are customary in the case of numbers are lost. 
Thus, very often and in very important instances, the operations 
prove to be noncommutative (a product is dependent on the order
of the factors), at times even nonassociative (a product of three 
factors depends on the placing of parentheses). 
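
The loss of these familiar properties is easy to observe on small cases. The following is only an illustrative sketch: the helper names `matmul` and `cross` and the particular matrices and vectors are our own choices, not taken from the text. It shows a product of two square matrices that depends on the order of the factors, and a vector product of three factors that depends on the placing of parentheses.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def cross(u, v):
    """Vector (cross) product of two three-dimensional vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


# Two illustrative matrices (an assumption of this sketch).
A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
AB = matmul(A, B)
BA = matmul(B, A)
print("AB =", AB, " BA =", BA, " equal:", AB == BA)

# Unit vectors along the first two coordinate axes.
i = (1, 0, 0)
j = (0, 1, 0)
left = cross(cross(i, i), j)    # (i x i) x j
right = cross(i, cross(i, j))   # i x (i x j)
print("(i x i) x j =", left, " i x (i x j) =", right)
```

Matrix multiplication remains associative; it is only the vector product that serves here as the nonassociative example.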

A very systematic study has been made of a few of the most 
important types of algebraic systems (or structures), that is, sets 
composed of entities of a certain nature for which certain algebraic 
operations have been defined. Such, for example, are fields. These
are algebraic systems in which (as in the systems of real and complex
numbers) the operations of addition and multiplication are defined,
both commutative and associative, connected by the distributive
law (the ordinary rule of removing brackets holds) and possessing
the inverse operations of subtraction and division. The theory
of fields was a natural area for the further development of the theory 
of equations, while its principal branches, the theory of fields of
algebraic numbers and the theory of fields of algebraic functions,
linked it up with the theory of numbers and the theory of functions
of a complex variable, respectively. The present course of higher 
algebra includes an elementary introduction to the theory of fields, 
and some portions of the course— polynomials in several unknowns, 
the normal form of a matrix— are presented directly for the case 
of an arbitrary base field. 

Broader than a field is the concept of a ring. Unlike the field, 
division is not required here and, besides, multiplication may be 
noncommutative and even nonassociative. The simplest instances
of rings are the set of all integers (including negative numbers),
the set of polynomials in one unknown and the set of real-valued
functions of a real variable. The theory of rings includes such old
branches of algebra as the theory of hypercomplex numbers and
the theory of ideals. It is related to a number of mathematical
sciences (functional analysis being one) and has already made 
inroads into physics. The course of higher algebra actually contains 
only the definition of a ring. 

Still greater in its range of applications is the theory of groups. 
A group is an algebraic system with one basic operation, which
must be associative but not necessarily commutative, and must
possess an inverse operation (division if the basic operation is
multiplication). Such, for example, is the set of integers with respect
to the operation of addition and also the set of positive real numbers
with respect to the operation of multiplication. Groups were
already important in the theory of Galois, in the problem of the
solvability of equations in terms of radicals; today groups are a powerful
tool in the theory of fields, in many divisions of geometry, in
topology, and also outside mathematics (in crystallography and
theoretical physics). Generally speaking, within the sphere of
algebra, group theory takes second place after linear algebra as to
its range of applications. Our course of higher algebra contains
a chapter on the fundamentals of the theory of groups.

In recent decades an entirely new branch of algebra, lattice
theory, has come to the fore. A lattice is an algebraic system with
two operations: addition and multiplication. These operations
must be commutative and associative and must also satisfy the
following requirements: both the sum and the product of an element
with itself must be equal to the element; if the sum of two elements
is equal to one of them, then the product is equal to the other, and
conversely. An example of a lattice is the system of natural numbers
relative to the operations of taking the least common multiple
and the greatest common divisor. Lattice theory has interesting
ties with the theory of groups and the theory of rings, and also
with the theory of sets; one old branch of geometry (projective
geometry) actually proved to be a part of the theory of lattices.
It is also worth mentioning the expansion of lattice theory into
the theory of electric circuits.
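
The example of the natural numbers can be checked mechanically on a finite range. The sketch below is ours and makes two assumptions: the least common multiple is taken as the "sum" and the greatest common divisor as the "product", and the names `join`, `meet` and `check_lattice_axioms` are hypothetical. Checking a finite range is of course an illustration, not a proof.

```python
from itertools import product
from math import gcd


def join(a, b):
    """The 'sum' of the lattice: least common multiple."""
    return a * b // gcd(a, b)


def meet(a, b):
    """The 'product' of the lattice: greatest common divisor."""
    return gcd(a, b)


def check_lattice_axioms(limit):
    """Check the lattice requirements for all triples from 1..limit."""
    for a, b, c in product(range(1, limit + 1), repeat=3):
        # Both operations are commutative.
        if join(a, b) != join(b, a) or meet(a, b) != meet(b, a):
            return False
        # Both operations are associative.
        if join(join(a, b), c) != join(a, join(b, c)):
            return False
        if meet(meet(a, b), c) != meet(a, meet(b, c)):
            return False
        # The sum and the product of an element with itself give the element.
        if join(a, a) != a or meet(a, a) != a:
            return False
        # The sum equals one element exactly when the product equals the other.
        if (join(a, b) == a) != (meet(a, b) == b):
            return False
    return True


print(check_lattice_axioms(12))
```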

Certain similarities between parts of the theories of groups, 
rings and lattices led to the development of a general theory of 
algebraic systems (or universal algebras). The theory has only taken 
a few steps but its general outlines are evident and certain links 
with mathematical logic that have been perceived point to a rich
future in this area.

The foregoing scheme does not of course embrace the whole
range of algebraic science. For one thing, there are a number of
divisions of algebra bordering on other areas of mathematics, such
as topological algebra, which deals with algebraic systems in which
the operations are continuous relative to some convergence defined
for the elements of the systems. An example is the system of real
numbers. Closely related to topological algebra is the theory of
continuous (or Lie) groups, which has found numerous applications
in a broad range of geometrical problems, in theoretical physics
and hydrodynamics. Incidentally, the theory of Lie groups is
characterized by such an interweaving of algebraic, topological,
geometric and function-theoretic methods as to be more properly
considered a special branch of mathematics altogether. Next we have the
theory of ordered algebraic systems which arose out of investigations 
into the fundamentals of geometry and has found applications 
in functional analysis. Finally, there is differential algebra which 
has established fresh relationships between algebra and the theory 
of differential equations. 

Quite naturally, the flowering of algebraic science so evident 
today is not accidental, but is an organic part of the general advance 
of mathematics and is due, in large measure, to the demands made 
upon algebra by the other mathematical sciences. On the other hand,
the development of algebra itself has exerted a far-reaching influence
on the elaboration of allied branches of science; this influence has
been particularly enhanced by the spread of applications so
characteristic of modern algebra. One is often tempted to speak of an
“algebraization” of mathematics.

We conclude this rather sketchy survey of algebra with a gene- 
ral historical background. 

Babylonian and, later, ancient Greek mathematicians studied 
certain problems of algebra, in particular the solution of simple
equations. The peak of algebraic investigations during this period
was reached in the works of the Greek mathematician Diophantus
of Alexandria (third century). These studies were then extended by
mathematicians of India: Aryabhata (sixth century), Brahmagupta
(seventh century), and Bhaskara (twelfth century). In China,
algebraic problems got an early start: Ch’ang Ts’ang (second century
B.C.), Ching Chou-chan (first century A.D.). An outstanding Chinese
algebraist was Ch’in Chiu-shao (thirteenth century).

A major contribution to the development of algebra was made 
by scholars of the Middle East whose writings were in Arabic, par- 
ticularly the Uzbek scholar Muhammad al-Khowarizmi (ninth cen- 
tury) and the Tajik mathematician and poet Omar Khayyam (1040- 
1123). In particular, the very term “algebra” came from the title 
of al-Khowarizmi’s treatise Hisab al-jabr w'al-muqabalah.

The above-mentioned studies of Babylonian, Greek, Indian,
Chinese, and Central-Asian algebraists have to do with those pro- 
blems of algebra which constitute the present school course of ele- 
mentary algebra and only occasionally touch on equations of the 
third degree. That, in the main, was the range of problems that 
interested medieval European algebraists and those of the Renais- 
sance, such as the Italian mathematician Leonardo of Pisa (Fibo- 
nacci) (twelfth century) and the founder of present-day algebraic 
symbolism, the Frenchman Vieta (or Viète) (1540-1603). We have
already mentioned that in the sixteenth century methods were 
found for solving equations of the third and fourth degree; here 
we must mention the names of the Italians Ferro (1465-1526), Tar- 
taglia (1500-1557), Cardano (1501-1576) and Ferrari (1522-1565). 

The seventeenth and eighteenth centuries saw an intensive ela- 
boration of the general theory of equations (or the algebra of poly- 
nomials) in which outstanding scholars of the time participated: 
Descartes (1596-1650), Sir Isaac Newton (1643-1727), d’Alembert
(1717-1783) and Lagrange (1736-1813). In the eighteenth century,
the Swiss mathematician Cramer (1704-1752) and Laplace (1749-1827)
of France laid the foundation of the theory of determinants.

At the turn of the century, the great German mathematician Gauss
(1777-1855) proved the earlier mentioned fundamental theorem on 
the existence of roots of equations with numerical coefficients. 

The first third of the nineteenth century stands out in the history
of algebra as the time when the problem of the solvability of equa- 
tions by radicals was resolved. Proof of the impossibility of obtain- 
ing formulas for the solution of equations of degree five or higher was 
obtained by the Italian mathematician Ruffini (1765-1822) and in 
more rigorous form by the Norwegian Abel (1802-1829). As already 
mentioned, an exhaustive treatment of the problem of the conditions 
under which an equation admits of solution in terms of radicals 
was given by Galois.

Galois’ theory spurred the advance of algebra in the latter half
of the nineteenth century. There appeared the theory of fields of 
algebraic numbers and of fields of algebraic functions and the asso- 
ciated theory of ideals. Here, mention should be made of the German 
mathematicians Kummer (1810-1893), Kronecker (1823-1891), and 
Dedekind (1831-1916), and the Russian mathematicians E. I. Zolo- 
tarev (1847-1878) and G. F. Voronoi (1868-1908). Particular advances 
were made in the theory of finite groups which grew out of the research 
of Lagrange and Galois; this work was carried out by the French 
mathematicians Cauchy (1789-1857) and Jordan (1838-1922), the
Norwegian Sylow (1832-1918), the German algebraists Frobenius
(1849-1918) and Hölder (1859-1937). The investigations of the
Norwegian S. Lie (1842-1899) initiated the theory of continuous groups.

The works of Hamilton (1805-1865) and the German mathemati- 
cian Grassmann (1809-1877) laid the foundations for the theory 
of hypercomplex systems or, as we now say, the theory of algebras. 
A prominent role in the development of this branch of algebra was 
played (at the end of the century) by the Russian mathematician
F. E. Molin (1861-1941).

Linear algebra attained great heights in the nineteenth century 
primarily due to the work of the English mathematicians Sylvester 
(1814-1897) and Cayley (1821-1895). Work continued on the algebra 
of polynomials; we note only the method of approximate solution 
of equations found by the Russian geometer N. I. Lobachevsky 
(1792-1856) and the work of the German Hurwitz (1859-1919).
Algebraic geometry was begun in the latter part of the nineteenth century,
particularly in the works of the German mathematician M. Noether
(1844-1922).

In the twentieth century, algebraic studies expanded considerab- 
ly and algebra, as we already know, occupies a very special place 
of honour in mathematics. New divisions of algebra have sprung
up, including the general theory of fields (in the 1910’s), the theory 
of rings and the general theory of groups (1920’s), topological algebra 
and lattice theory (1930’s), the theory of semigroups and the theory 
of quasigroups, the theory of universal algebras, homological algebra, 
the theory of categories (all in the 1940’s and 1950’s). Prominent 
mathematicians are presently engaged in all spheres of algebra, and 
in a number of countries (in the Soviet Union, for example) whole
schools of algebra are in evidence.

Among the prerevolutionary Russian algebraists, noteworthy 
contributions to algebra were also made by S.O. Shatunovsky 
(1859-1929) and D. A. Grave (1863-1939). However, it was only 
after the Great October Revolution of 1917 that algebraic investiga- 
tions in the Soviet Union reached high peaks. These studies now 
embrace practically all divisions of modern algebraic science and 
in some the work of Soviet algebraists is of a leading nature. Suffice
it to name only two algebraists: N. G. Chebotarev (1894-1947), who
worked in the theory of fields and Lie groups, and O. Yu. Schmidt
(1891-1956), the famous polar explorer who was also a noted algebraist
and founded the Soviet school of group theory.

We conclude this brief survey of the historical background and 
modern state of algebra with the remark that most of the fields of 
research mentioned here lie beyond the scope of the present course 
of higher algebra. The aim of the survey was to help the reader to 
find the proper place for this text in algebraic science as a whole 
within the edifice of mathematics. 



CHAPTER 1 


SYSTEMS OF LINEAR EQUATIONS.
DETERMINANTS


1. The Method of Successive Elimination of Unknowns 

We begin the course of higher algebra with a study of systems 
of first-degree equations in several unknowns or, to use the more 
common term, systems of linear equations.*

The theory of systems of linear equations serves as the foundation 
for a vast and important division of algebra— linear algebra— to 
which a good portion of this book is devoted (the first three chapters 
in particular). The coefficients of the equations considered in these 
three chapters, the values of the unknowns and, generally, all num- 
bers that will be encountered are to be considered real. Incidentally, 
all the material of these three chapters is readily extendable to the 
case of arbitrary complex numbers which are familiar from elementary
mathematics.

In contrast to elementary algebra, we will study systems with
an arbitrary number of equations and unknowns; at times, the
number of equations of a system will not even be assumed to coincide
with the number of unknowns. Suppose we have a system of $s$ linear
equations in $n$ unknowns. Let us agree to use the following symbolism:
the unknowns will be denoted by $x$ and subscripts: $x_1, x_2, \dots, x_n$;
we will consider the equations to be enumerated thus:
first, second, ..., $s$th; the coefficient of $x_j$ in the $i$th equation will
be given as $a_{ij}$.** Finally, the constant term of the $i$th equation will
be indicated as $b_i$.


* The term “linear” stems from analytic geometry, where a first-degree
equation in two unknowns defines a straight line in a plane.

** We thus use two subscripts, the first indicates the position number of
the equation, the second the position number of the unknown. They are to be
read: $a_{11}$ “a sub one one” and not “a eleven”; $a_{34}$ “a sub three four” and not
“a thirty-four”, and are not separated by a comma.
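Our system of equations will now be written as follows:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2,\\
\dots\quad\dots\quad\dots\quad\dots &\\
a_{s1}x_1 + a_{s2}x_2 + \dots + a_{sn}x_n &= b_s.
\end{aligned}
\tag{1}
$$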

The coefficients of the unknowns form a rectangular array:

$$
\begin{pmatrix}
a_{11} & a_{12} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{s1} & a_{s2} & \dots & a_{sn}
\end{pmatrix}
\tag{2}
$$

called a matrix of $s$ rows and $n$ columns; the numbers $a_{ij}$ are termed
elements of the matrix.* If $s = n$ (which means the number of rows
is equal to the number of columns), then the matrix is called a square
matrix of order $n$. The diagonal of the matrix from upper left corner
to lower right corner (i.e., composed of the elements $a_{11}, a_{22}, \dots, a_{nn}$)
is called the principal diagonal. We call a square matrix of order
$n$ a unit matrix of order $n$ if all the elements of its principal diagonal
are equal to unity and all other elements are zero.

The solution of the system of linear equations (1) is a set of $n$
numbers $k_1, k_2, \dots, k_n$ such that each of the equations (1) becomes
an identity upon substitution of the corresponding numbers $k_i$,
$i = 1, 2, \dots, n$, for the unknowns $x_i$.**

A system of linear equations may not have any solutions; it is 
then called inconsistent. Such, for example, is the system 

$$
\begin{aligned}
x_1 + 5x_2 &= 1,\\
x_1 + 5x_2 &= 7.
\end{aligned}
$$

The left members of these equations coincide, but the right members 
are different and so no set of values of the unknowns can satisfy 
both equations simultaneously. 

If a system of linear equations has solutions, it is termed
consistent. A consistent system is called determinate if it has a unique
solution (only such are considered in elementary algebra) and
indeterminate if there are more solutions than one. As we shall learn
later on, there may even be an infinity of solutions. For instance,


* Thus, if the matrix (2) is regarded by itself (not connected with the
system (1)), then the first subscript of element $a_{ij}$ indicates the number of the
row, the second the number of the column at the intersection of which the
element is positioned.

** We stress the fact that the numbers $k_1, k_2, \dots, k_n$ constitute a single
solution of the system and not $n$ solutions.
the system

$$
\begin{aligned}
x_1 + 2x_2 &= 7,\\
x_1 + x_2 &= 4
\end{aligned}
$$

is determinate: it has the solution $x_1 = 1$, $x_2 = 3$ and, as may readily
be verified by the method of elimination, this solution is unique.
On the other hand, the system

$$
\begin{aligned}
3x_1 - x_2 &= 1,\\
6x_1 - 2x_2 &= 2
\end{aligned}
$$

is indeterminate since it has infinitely many solutions of the form

$$
x_1 = k, \qquad x_2 = 3k - 1 \tag{3}
$$

where $k$ is an arbitrary number; the solutions obtained using
formulas (3) exhaust the solutions of the system.
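
Both examples can be checked by direct substitution. The sketch below is ours: the function names `satisfies` and `solution_from_formula_3` are hypothetical, and the constant terms 7 and 2 are taken as the values implied by the stated solutions. It verifies that $x_1 = 1$, $x_2 = 3$ satisfies the first system and that formulas (3) satisfy the second for several values of $k$, including fractional ones.

```python
from fractions import Fraction


def satisfies(equations, values):
    """Return True if `values` turns every equation into an identity.

    Each equation is a pair (coefficients, constant term).
    """
    return all(sum(c * v for c, v in zip(coeffs, values)) == rhs
               for coeffs, rhs in equations)


# x1 + 2*x2 = 7,  x1 + x2 = 4
determinate = [((1, 2), 7), ((1, 1), 4)]
# 3*x1 - x2 = 1,  6*x1 - 2*x2 = 2
indeterminate = [((3, -1), 1), ((6, -2), 2)]


def solution_from_formula_3(k):
    """Solution x1 = k, x2 = 3k - 1 given by formulas (3)."""
    return (k, 3 * k - 1)


print(satisfies(determinate, (1, 3)))
print(all(satisfies(indeterminate, solution_from_formula_3(Fraction(m, 2)))
          for m in range(-6, 7)))
```

Substitution confirms only that these values are solutions; the uniqueness claimed for the first system still rests on the elimination argument.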

The problem of the theory of systems of linear equations consists 
in elaborating methods to determine whether a given system of equations
is consistent or not and, in the case of consistency, to establish
the number of solutions and also to indicate a procedure for finding 
the solutions. 

We begin with the most convenient practical method for finding 
solutions to systems with numerical coefficients, namely, the method 
of successive elimination of unknowns, or Gauss' method. 

First, a preliminary remark. In future we will manipulate systems 
of equations in the following manner: both members of one of the 
equations of the system multiplied by one and the same number will
be subtracted from the corresponding members of some other equation 
of the system. For the sake of definiteness, let us subtract both 
members of the first equation of system (1), multiplied by a number 
c, from the corresponding members of the second equation. We obtain 
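a new system of linear equations:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1,\\
a'_{21}x_1 + a'_{22}x_2 + \dots + a'_{2n}x_n &= b'_2,\\
a_{31}x_1 + a_{32}x_2 + \dots + a_{3n}x_n &= b_3,\\
\dots\quad\dots\quad\dots\quad\dots &\\
a_{s1}x_1 + a_{s2}x_2 + \dots + a_{sn}x_n &= b_s,
\end{aligned}
\tag{4}
$$

where

$$
a'_{2j} = a_{2j} - c\,a_{1j} \quad \text{for } j = 1, 2, \dots, n, \qquad b'_2 = b_2 - c\,b_1.
$$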




The systems (1) and (4) are equivalent, which is to say they are
either both inconsistent or they are both consistent and have the same
solutions. Indeed, let $k_1, k_2, \dots, k_n$ be an arbitrary solution of
system (1). Obviously, these numbers satisfy all the equations of (4)
except the second. However, they likewise satisfy the second equation
of the system (4). It will suffice to recall how this equation
is expressed in terms of the second and first equations of system (1). 
Conversely, any solution of (4) will also satisfy (1). Indeed, the 
second equation of (1) is obtained by subtracting, from both members 
of the second equation of (4), the corresponding members of the 
first equation of the system multiplied by the number —c. 

Quite naturally, if manipulations of this kind are applied several 
times to system (1), the newly obtained system of equations will remain
equivalent to the original system (1). 
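
The manipulation and its reversibility can be sketched in a few lines. This is only an illustration under our own assumptions: a system is stored as a list of (coefficients, constant term) pairs, and the names `subtract_multiple` and `satisfies` are hypothetical. Subtracting $c$ times one equation from another and then subtracting $-c$ times the same equation restores the original system, which is why no solutions are gained or lost.

```python
def subtract_multiple(system, src, dst, c):
    """Return a new system in which c times equation `src`
    has been subtracted from equation `dst`."""
    src_coeffs, src_rhs = system[src]
    dst_coeffs, dst_rhs = system[dst]
    new_equation = ([a - c * b for a, b in zip(dst_coeffs, src_coeffs)],
                    dst_rhs - c * src_rhs)
    return system[:dst] + [new_equation] + system[dst + 1:]


def satisfies(system, values):
    """Return True if `values` satisfies every equation of the system."""
    return all(sum(a * v for a, v in zip(coeffs, values)) == rhs
               for coeffs, rhs in system)


# x1 + 2*x2 = 7,  x1 + x2 = 4  (an example system with solution (1, 3))
original = [([1, 2], 7), ([1, 1], 4)]

# Subtract the first equation (c = 1) from the second.
transformed = subtract_multiple(original, 0, 1, 1)
# Undo the step by subtracting the first equation multiplied by -c.
restored = subtract_multiple(transformed, 0, 1, -1)

print(transformed)
print(satisfies(original, (1, 3)), satisfies(transformed, (1, 3)))
print(restored == original)
```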

It may happen that as a result of such manipulations, there 
will appear in our system an equation whose coefficients in the 
left-hand member are equal to zero. Now if the constant term of this 
equation is zero, then the equation is satisfied for any values of the 
unknowns and so by discarding this equation we arrive at a system of 
equations equivalent to the original system. But if the constant term 
of the equation at hand is nonzero, then the equation cannot be 
satisfied for any values of the unknowns and for this reason the system 
obtained (and the equivalent original system as well) will be inconsistent.

Let us now examine Gauss’ method. 

We are given an arbitrary system of linear equations (1). To be
specific, suppose that the coefficient $a_{11} \neq 0$, though in reality it
may of course be equal to zero and then we would have to start with
some other, nonzero, coefficient of the first equation of the system.

Let us now transform system (1) by eliminating the unknown $x_1$
from all equations except the first. To do this, multiply both members
of the first equation by the number $\dfrac{a_{21}}{a_{11}}$ and subtract from the
corresponding members of the second equation; then subtract both
members of the first equation, multiplied by $\dfrac{a_{31}}{a_{11}}$, from the
corresponding members of the third equation, and so on.

We thus arrive at a new system made up of $s$ linear equations
in $n$ unknowns:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \dots + a_{1n}x_n &= b_1,\\
a'_{22}x_2 + a'_{23}x_3 + \dots + a'_{2n}x_n &= b'_2,\\
a'_{32}x_2 + a'_{33}x_3 + \dots + a'_{3n}x_n &= b'_3,\\
\dots\quad\dots\quad\dots\quad\dots &\\
a'_{s2}x_2 + a'_{s3}x_3 + \dots + a'_{sn}x_n &= b'_s.
\end{aligned}
\tag{5}
$$

We do not need to write out explicitly the expressions of the new
coefficients $a'_{ij}$ and the new constant terms $b'_i$ via the coefficients
and constant terms of the original system (1).

As we know, the system of equations (5) is equivalent to (1).
Now transform system (5). We no longer involve the first equation and
manipulate only that portion of (5) consisting of all equations
except the first. We of course assume that there are no equations
with all coefficients of the left members zero (such would have been
rejected if their constant terms were likewise zero, and if that were
not so we would have proved the inconsistency of our system).
Thus, among the coefficients $a'_{ij}$ there are some different from zero;
for definiteness, we put $a'_{22} \neq 0$. Now transform (5) by subtracting
from both members of the third and of each of the succeeding equations
both members of the second equation multiplied respectively
by the numbers

$$
\frac{a'_{32}}{a'_{22}},\ \frac{a'_{42}}{a'_{22}},\ \dots,\ \frac{a'_{s2}}{a'_{22}}.
$$
In this way we eliminate the unknown $x_2$ from all equations, except
the first and second, and arrive at the following system of equations
which is equivalent to (5) and hence to (1):

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \dots + a_{1n}x_n &= b_1,\\
a'_{22}x_2 + a'_{23}x_3 + \dots + a'_{2n}x_n &= b'_2,\\
a''_{33}x_3 + \dots + a''_{3n}x_n &= b''_3,\\
\dots\quad\dots\quad\dots\quad\dots &\\
a''_{t3}x_3 + \dots + a''_{tn}x_n &= b''_t.
\end{aligned}
$$

Our system now contains $t$ equations, $t \le s$, since some of the equations
were possibly discarded. Quite obviously, the number of equations
of the system could already have diminished after eliminating
the unknown $x_1$. Subsequently, only a portion of the system obtained
(that containing all equations except the first two) will be subject
to transformations.

The question arises as to when this process of successive elimination
of unknowns will stop.

If we arrive at a system in which one of the equations has a nonzero
constant term and all the coefficients of the left member are
equal to zero, then, as we know, our original system was inconsistent.

If that is not the case, then we obtain the following system of
equations which is equivalent to system (1):

fljlX, -\-a,2'^2 - • * "i ^ 1 . ^ih^h + • • • 

(^22^2 '^ • • • + • • • "I" ^2n^n — ^2. 


I^c-i -h “i'-'nVi, + . • • + = brr', 


(h-2) 


ai.h 




1 I”— I Iv' 

Xfc -!'••• “j“ Oim T — Oh 


(6) 


2 * 
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Hero a’,, # 0 #0. a'V =5^ 0. Note also that 

k C s, and, obviously, A: < n. 

In this case system (1) is consistent. It will be determinate for k = n 

and indeterminate for k <.n. 

Indeed, if $k = n$, then system (6) has the form

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1,\\
a'_{22}x_2 + \ldots + a'_{2n}x_n &= b'_2,\\
\ldots &\\
a^{(n-1)}_{nn}x_n &= b^{(n-1)}_n.
\end{aligned} \tag{7}$$

From the last equation we obtain a quite definite value for the
unknown $x_n$. Substituting it into the next to the last equation, we
find a uniquely defined value for the unknown $x_{n-1}$. Continuing in
similar fashion, we find that system (7) and, for this reason, system
(1) as well have a unique solution, that is to say, they are consistent
and determinate.
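
The bottom-to-top computation just described can be sketched in Python.
This is only an illustration under stated assumptions: the function name
`back_substitute` and the sample triangular system are chosen here for
the example, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def back_substitute(upper, rhs):
    """Solve a triangular system with nonzero diagonal, last equation first."""
    n = len(rhs)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        # Carry the already determined unknowns x[i+1:] to the right-hand side.
        known = sum(Fraction(upper[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(rhs[i]) - known) / Fraction(upper[i][i])
    return x

# Illustrative triangular system (assumed for this sketch):
#   x1 + 2*x2 + 5*x3 = -9,   -3*x2 - 2*x3 = 11,   -8*x3 = 8
U = [[1, 2, 5], [0, -3, -2], [0, 0, -8]]
b = [-9, 11, 8]
solution = back_substitute(U, b)
```

Each unknown is found exactly once, which mirrors the argument that a
triangular system with nonzero diagonal coefficients has a unique solution.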

But if $k < n$, then for the "free" unknowns $x_{k+1}, \ldots, x_n$ we take
arbitrary numerical values; then, moving in system (6) from bottom
to top, we find quite definite values for the unknowns
$x_k, x_{k-1}, \ldots, x_2, x_1$ (as above). Since the values for the free
unknowns may be chosen in an infinity of ways, our system (6) and,
hence, (1) as well are consistent but indeterminate. It is easy to
verify that by using the foregoing method (given all possible choices of
values for the free unknowns) we can find all the solutions of system (1).

At first glance, yet another form to which a system of linear
equations may be reduced by the Gaussian method would appear
possible, namely, the form obtained by adjoining to system (7) a number
of equations containing only the unknown $x_n$. Actually, however,
in this case the transformations have simply not been completed:
since $a^{(n-1)}_{nn} \neq 0$, the unknown $x_n$ may be eliminated in all
equations from the $(n+1)$th on.

Note that the "triangular" form of the system of equations (7)
or the "trapezoidal" form of system (6) (for $k < n$) resulted from the
assumption that the coefficients $a_{11}$, $a'_{22}$, etc. are different from zero.
In the general case, the system of equations which we arrive at after
completing the process of elimination of unknowns takes on a triangular
or trapezoidal form only after an appropriate alteration in the
numbering of the unknowns.

To summarize, then, we find that the Gaussian method is applicable
to any system of linear equations. The system is inconsistent if after
the transformations we obtain an equation in which the coefficients of
all unknowns are zero and the constant term is nonzero; but if no such
equation is encountered, the system is consistent. A consistent system of
equations is determinate if it reduces to the triangular form (7) and
indeterminate if it reduces to the trapezoidal form (6) for $k < n$.

Let us apply what has been said to the case of a system of
homogeneous linear equations, that is, equations whose constant terms
are zero. Such a system is always consistent since it has a zero
solution (0, 0, ..., 0). Suppose that in the system at hand the number
of equations is less than the number of unknowns. Then our system
cannot reduce to the triangular form since in the Gaussian elimination
process the number of equations of the system can diminish
but not increase; hence, it reduces to the trapezoidal form and so
is indeterminate.

To put it otherwise, if in a system of homogeneous linear equations
the number of equations is less than the number of unknowns, then this
system has, in addition to the zero solution, nonzero solutions, that is,
solutions in which the values of some (or even all) unknowns are
nonzero. There is an infinity of such solutions.

In practical solutions of a system of linear equations by the
Gaussian method, one should write down the matrix of the coefficients
of the system and adjoin a column made up of the constant
terms, which, for the sake of convenience, are separated by a vertical
line, and then perform all the manipulations on the rows of this
"augmented" matrix.


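Example 1. Solve the system

$$\begin{aligned}
x_1 + 2x_2 + 5x_3 &= -9,\\
x_1 - x_2 + 3x_3 &= 2,\\
3x_1 - 6x_2 - x_3 &= 25.
\end{aligned}$$

Transform the augmented matrix of the system:

$$\left(\begin{array}{rrr|r} 1 & 2 & 5 & -9\\ 1 & -1 & 3 & 2\\ 3 & -6 & -1 & 25 \end{array}\right)
\to
\left(\begin{array}{rrr|r} 1 & 2 & 5 & -9\\ 0 & -3 & -2 & 11\\ 0 & -12 & -16 & 52 \end{array}\right)
\to
\left(\begin{array}{rrr|r} 1 & 2 & 5 & -9\\ 0 & -3 & -2 & 11\\ 0 & 0 & -8 & 8 \end{array}\right)$$

We thus arrive at the following system of equations:

$$\begin{aligned}
x_1 + 2x_2 + 5x_3 &= -9,\\
-3x_2 - 2x_3 &= 11,\\
-8x_3 &= 8,
\end{aligned}$$

which has the unique solution

$$x_1 = 2,\quad x_2 = -3,\quad x_3 = -1.$$

The original system proved to be determinate.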
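Example 2. Solve the system

$$\begin{aligned}
x_1 - 5x_2 - 8x_3 + x_4 &= 3,\\
3x_1 + x_2 - 3x_3 - 5x_4 &= 1,\\
x_1 - 7x_3 + 2x_4 &= -5,\\
11x_2 + 20x_3 - 9x_4 &= 2.
\end{aligned}$$

We transform the augmented matrix of the system:

$$\left(\begin{array}{rrrr|r} 1 & -5 & -8 & 1 & 3\\ 3 & 1 & -3 & -5 & 1\\ 1 & 0 & -7 & 2 & -5\\ 0 & 11 & 20 & -9 & 2 \end{array}\right)
\to
\left(\begin{array}{rrrr|r} 1 & -5 & -8 & 1 & 3\\ 0 & 16 & 21 & -8 & -8\\ 0 & 5 & 1 & 1 & -8\\ 0 & 11 & 20 & -9 & 2 \end{array}\right)
\to
\left(\begin{array}{rrrr|r} 1 & -5 & -8 & 1 & 3\\ 0 & -89 & 0 & -29 & 160\\ 0 & 5 & 1 & 1 & -8\\ 0 & 0 & 0 & 0 & 2 \end{array}\right)$$

We arrive at a system containing the equation 0 = 2. Consequently, the original
system is inconsistent.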

Example 3. Solve the system

$$\begin{aligned}
4x_1 + x_2 - 3x_3 - x_4 &= 0,\\
2x_1 + 3x_2 + x_3 - 5x_4 &= 0,\\
x_1 - 2x_2 - 2x_3 + 3x_4 &= 0.
\end{aligned}$$

This is a system of homogeneous equations, and the number of equations
is less than the number of unknowns; it must therefore be indeterminate. Since
all the constant terms are zero, we perform manipulations solely with the
matrix of the coefficients of the system:

$$\begin{pmatrix} 4 & 1 & -3 & -1\\ 2 & 3 & 1 & -5\\ 1 & -2 & -2 & 3 \end{pmatrix}
\to
\begin{pmatrix} 0 & 9 & 5 & -13\\ 0 & 7 & 5 & -11\\ 1 & -2 & -2 & 3 \end{pmatrix}
\to
\begin{pmatrix} 0 & 2 & 0 & -2\\ 0 & 7 & 5 & -11\\ 1 & -2 & -2 & 3 \end{pmatrix}$$

We arrive at the following system of equations:

$$\begin{aligned}
2x_2 - 2x_4 &= 0,\\
7x_2 + 5x_3 - 11x_4 &= 0,\\
x_1 - 2x_2 - 2x_3 + 3x_4 &= 0.
\end{aligned}$$

We can take either one of the unknowns $x_2$ or $x_4$ for the free unknown.
Let $x_4 = \alpha$. Then from the first equation it follows that $x_2 = \alpha$, and from
the second equation we get $x_3 = \frac{4}{5}\alpha$ and, finally, from the third equation
$x_1 = \frac{3}{5}\alpha$. Thus,

$$\frac{3}{5}\alpha,\quad \alpha,\quad \frac{4}{5}\alpha,\quad \alpha$$

is the general form of the solutions of the given system of equations.
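
The classification reached in Examples 1 to 3 can be checked with a short
Python sketch of the elimination process on the augmented matrix. It is
offered as an illustration only: the name `gauss_classify` is an assumption,
exact fractions replace hand arithmetic, and a column without a usable
nonzero coefficient is simply skipped, which plays the role of renumbering
the unknowns.

```python
from fractions import Fraction

def gauss_classify(aug, n):
    """Classify a linear system from its augmented matrix (n unknowns)."""
    rows = [[Fraction(v) for v in row] for row in aug]
    rank = 0  # number of equations already brought to stepped form
    for col in range(n):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # this unknown stays free
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    # An equation 0 = c with c nonzero proves inconsistency.
    for row in rows:
        if all(v == 0 for v in row[:n]) and row[n] != 0:
            return "inconsistent"
    return "determinate" if rank == n else "indeterminate"

example1 = [[1, 2, 5, -9], [1, -1, 3, 2], [3, -6, -1, 25]]
example2 = [[1, -5, -8, 1, 3], [3, 1, -3, -5, 1], [1, 0, -7, 2, -5], [0, 11, 20, -9, 2]]
example3 = [[4, 1, -3, -1, 0], [2, 3, 1, -5, 0], [1, -2, -2, 3, 0]]
```

Equations whose coefficients and constant term all vanish are left in
place rather than discarded; they do not affect the verdict.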


2. Determinants of Second and Third Order

The method of solving systems of linear equations given in
Sec. 1 is extremely simple and requires the performance of the same
kind of computations, which are readily carried out on computing
machines. Its drawback, however, is that it does not enable us to
state the conditions of consistency or determinacy of the system by
means of coefficients and constant terms of the system. On the other
hand, even in the case of a determinate system, this method does
not permit finding formulas that express the solution of the system
in terms of its coefficients and constant terms. However, all this
proves to be necessary in theoretical problems, in particular in
geometrical investigations; for this reason, the theory of systems of
linear equations has to be elaborated by different and more profound
methods. The general case will be pursued in the next chapter; for
the present we consider determinate systems having an equal number
of equations and unknowns. We begin with the systems in two
and three unknowns of elementary algebra.

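Let there be given a system of two linear equations in two unknowns,

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 &= b_1,\\
a_{21}x_1 + a_{22}x_2 &= b_2,
\end{aligned} \tag{1}$$

whose coefficients form the second-order square matrix

$$\begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{pmatrix} \tag{2}$$

Applying to system (1) the method of equalizing the coefficients,
we obtain

$$\begin{aligned}
(a_{11}a_{22} - a_{12}a_{21})\,x_1 &= b_1a_{22} - a_{12}b_2,\\
(a_{11}a_{22} - a_{12}a_{21})\,x_2 &= a_{11}b_2 - b_1a_{21}.
\end{aligned}$$

Assume that $a_{11}a_{22} - a_{12}a_{21} \neq 0$. Then

$$x_1 = \frac{b_1a_{22} - a_{12}b_2}{a_{11}a_{22} - a_{12}a_{21}},\qquad
x_2 = \frac{a_{11}b_2 - b_1a_{21}}{a_{11}a_{22} - a_{12}a_{21}}. \tag{3}$$

It is easy to show, by substituting the values of the unknowns into
(1), that (3) is a solution of system (1). The question of the
uniqueness of this solution will be considered in Sec. 7.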

The common denominator of the values of the unknowns (3) is
very simply expressed in terms of the elements of matrix (2): it is
equal to the product of the elements of the principal diagonal minus
the product of the elements of the secondary diagonal. This number
is called the determinant of the matrix (2); we call it a second-order
determinant since the matrix (2) is a second-order matrix. To
symbolize a determinant, we use vertical lines in place of parentheses:

$$\begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}. \tag{4}$$

Examples.

(1) …

(2) $\begin{vmatrix} 1 & -2\\ 3 & 5 \end{vmatrix} = 1 \cdot 5 - (-2) \cdot 3 = 11.$

It is worth stressing once again that while a matrix is an array
of numbers, a determinant is a number associated in a definite way
with a square matrix. The products $a_{11}a_{22}$ and $a_{12}a_{21}$ are called the
terms of a second-order determinant.

The numerators of expressions (3) have the same form as the
denominators, that is, they are also determinants of second order;
the numerator of the expression for $x_1$ is the determinant of the
matrix obtained from matrix (2) by replacing its first column by the
column of constant terms of system (1), the numerator of the
expression for $x_2$ is the determinant of the matrix obtained from matrix
(2) by replacing its second column in the same way. We can now write
formula (3) as follows:

$$x_1 = \frac{\begin{vmatrix} b_1 & a_{12}\\ b_2 & a_{22} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix}},\qquad
x_2 = \frac{\begin{vmatrix} a_{11} & b_1\\ a_{21} & b_2 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix}}. \tag{5}$$

This rule for the solution of a system of two linear equations
in two unknowns (called Cramer's rule) is formulated as follows.

If the determinant (4) of the coefficients of a system of equations
(1) is different from zero, we obtain the solution of system (1) by taking
for the values of the unknowns the fractions whose common denominator
is determinant (4) and whose numerator for the unknown $x_i$ ($i = 1, 2$)
is a determinant obtained by replacing in determinant (4) the $i$th column
(that is, the column of coefficients of the desired unknown) by the column
of the constant terms of system (1).*

* Here and below, for brevity, we speak of replacing columns "in the
determinant". Similarly, where convenient, we shall speak of rows and
columns of a determinant, of its elements, diagonals, etc.

Example. Solve the system

$$\begin{aligned}
2x_1 + x_2 &= 7,\\
x_1 - \ldots &= -2.
\end{aligned}$$

The determinant of the coefficients is

$$\ldots$$

It is different from zero and, for this reason, Cramer's rule is applicable.
The determinants

$$\ldots$$

are the numerators for the unknowns. Thus, the following set of numbers is the
solution of our system:
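
As an illustration of Cramer's rule for two equations, here is a short
Python sketch. The function names `det2` and `cramer2` are assumptions,
and so is the sample system: its first equation is the one legible in the
example above, while the second equation, $x_1 - 3x_2 = -2$, is a
hypothetical completion chosen only so that the sketch has concrete input.

```python
from fractions import Fraction

def det2(a11, a12, a21, a22):
    """Second-order determinant: principal diagonal minus secondary diagonal."""
    return a11 * a22 - a12 * a21

def cramer2(a, b):
    """Solve a 2x2 system with integer coefficients by Cramer's rule."""
    (a11, a12), (a21, a22) = a
    b1, b2 = b
    d = det2(a11, a12, a21, a22)
    if d == 0:
        raise ValueError("determinant is zero; Cramer's rule does not apply")
    d1 = det2(b1, a12, b2, a22)   # first column replaced by constant terms
    d2 = det2(a11, b1, a21, b2)   # second column replaced by constant terms
    return Fraction(d1, d), Fraction(d2, d)

# Hypothetical system: 2*x1 + x2 = 7,  x1 - 3*x2 = -2 (second equation assumed)
x1, x2 = cramer2([[2, 1], [1, -3]], [7, -2])
```

The values returned can be substituted back into both equations to confirm
that they satisfy the system.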



The introduction of second-order determinants does not
substantially simplify the solution of a system of two linear equations
in two unknowns, which does not present any difficulties as it is.
However, for the case of systems of three linear equations in three
unknowns, similar methods are of practical utility. Suppose we
have a system

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1,\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2,\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= b_3
\end{aligned} \tag{6}$$

with the coefficient matrix

$$\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{pmatrix} \tag{7}$$

If we multiply both sides of the first equation of (6) by the number
$a_{22}a_{33} - a_{23}a_{32}$, both sides of the second equation by
$a_{13}a_{32} - a_{12}a_{33}$, both sides of the third equation by
$a_{12}a_{23} - a_{13}a_{22}$, and then add all three equations, it is easy
to verify that the coefficients of $x_2$ and $x_3$ will turn out to be zero,
that is, these unknowns are eliminated simultaneously and we obtain the
equation

$$\begin{aligned}
&(a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32})\,x_1\\
&\quad = b_1a_{22}a_{33} + a_{12}a_{23}b_3 + a_{13}b_2a_{32} - a_{13}a_{22}b_3 - a_{12}b_2a_{33} - b_1a_{23}a_{32}.
\end{aligned} \tag{8}$$

Here, the coefficient of $x_1$ is called a third-order determinant
corresponding to matrix (7). The symbolism is the same as in the case
of second-order determinants; thus,

$$\begin{vmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32}
- a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}. \tag{9}$$

The expression for a third-order determinant is rather involved,
but the rule for its formation from the elements of matrix (7) is
extremely simple: one of the three terms (of the determinant)
in (9) with the plus sign is the product of the elements of the
principal diagonal, each of the other two is a product of the elements
lying parallel to this diagonal, with the third factor added from
the opposite corner of the matrix. The terms with the minus sign
in (9) are constructed in a similar manner but relative to the
secondary diagonal. We obtain a technique for computing determinants
of the third order that produces quick results (after a certain amount
of practice). Fig. 1 gives a schematic view of computing the positive
terms (left) and the negative terms (right) of a third-order
determinant.

Fig. 1




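Examples.

(1) $\begin{vmatrix} 2 & 1 & 2\\ -4 & 3 & 1\\ 2 & 3 & 5 \end{vmatrix}
= 2 \cdot 3 \cdot 5 + 1 \cdot 1 \cdot 2 + 2 \cdot (-4) \cdot 3
- 2 \cdot 3 \cdot 2 - 1 \cdot (-4) \cdot 5 - 2 \cdot 1 \cdot 3
= 30 + 2 - 24 - 12 + 20 - 6 = 10.$

(2) $\begin{vmatrix} 1 & 0 & -5\\ -2 & 3 & 2\\ 1 & -2 & 0 \end{vmatrix}
= 1 \cdot 3 \cdot 0 + 0 \cdot 2 \cdot 1 + (-5) \cdot (-2) \cdot (-2)
- (-5) \cdot 3 \cdot 1 - 0 \cdot (-2) \cdot 0 - 1 \cdot 2 \cdot (-2)
= -20 + 15 + 4 = -1.$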
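
The six-term rule (9) is easy to mechanize. The following Python sketch
(the name `det3` is an assumption) groups the three positive and the three
negative terms exactly as in the rule and can be used to check the two
determinants computed above.

```python
def det3(m):
    """Third-order determinant by the six-term rule (9)."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    # Terms taken along the principal diagonal and parallel to it.
    plus = a11 * a22 * a33 + a12 * a23 * a31 + a13 * a21 * a32
    # Terms taken along the secondary diagonal and parallel to it.
    minus = a13 * a22 * a31 + a12 * a21 * a33 + a11 * a23 * a32
    return plus - minus

first = det3([[2, 1, 2], [-4, 3, 1], [2, 3, 5]])
second = det3([[1, 0, -5], [-2, 3, 2], [1, -2, 0]])
```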


The right-hand side of (8) is also a third-order determinant,
namely, the determinant of the matrix obtained from matrix (7) by
replacing its first column by the column of constant terms of system
(6). If we denote determinant (9) by the letter $d$ and the determinant
obtained by replacing its $j$th column ($j = 1, 2, 3$) by the column
of constant terms of system (6) by the symbol $d_j$, then equation (8)
becomes $dx_1 = d_1$, whence, for $d \neq 0$, it follows that

$$x_1 = \frac{d_1}{d}. \tag{10}$$

In exactly the same way, by multiplying equations (6) by the
numbers $a_{23}a_{31} - a_{21}a_{33}$, $a_{11}a_{33} - a_{13}a_{31}$,
$a_{13}a_{21} - a_{11}a_{23}$ respectively, we obtain for $x_2$ the
following expression (again for $d \neq 0$):

$$x_2 = \frac{d_2}{d}. \tag{11}$$

Finally, multiplying these equations, respectively, by
$a_{21}a_{32} - a_{22}a_{31}$, $a_{12}a_{31} - a_{11}a_{32}$,
$a_{11}a_{22} - a_{12}a_{21}$, we arrive at the expression for $x_3$:

$$x_3 = \frac{d_3}{d}. \tag{12}$$

Substituting expressions (10) to (12) into equations (6) (it is
of course assumed that the determinants $d$ and all $d_j$ are written
in expanded form), we would find, after cumbersome computations
that are nevertheless well within the grasp of the reader, that all these
equations are satisfied, that is, that the numbers (10)-(12)
constitute the solution of system (6). Thus, if the determinant of the
coefficients of a system of three linear equations in three unknowns is nonzero,
then the solution of this system may be found by Cramer's rule as stated
for the case of a system of two equations. In Sec. 7 the reader will find
a different proof of this assertion (one that does not rely on the
calculations we have omitted here) and also a proof of the uniqueness
of the solution (10)-(12) of system (6) for the more general case.


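Example. Solve the system of equations

$$\begin{aligned}
2x_1 - x_2 + x_3 &= 0,\\
3x_1 + 2x_2 - 5x_3 &= 1,\\
x_1 + 3x_2 - 2x_3 &= 4.
\end{aligned}$$

The determinant of the coefficients is nonzero:

$$d = \begin{vmatrix} 2 & -1 & 1\\ 3 & 2 & -5\\ 1 & 3 & -2 \end{vmatrix} = 28,$$

so Cramer's rule is applicable. The numerators for the unknowns are

$$d_1 = \begin{vmatrix} 0 & -1 & 1\\ 1 & 2 & -5\\ 4 & 3 & -2 \end{vmatrix} = 13,\qquad
d_2 = \begin{vmatrix} 2 & 0 & 1\\ 3 & 1 & -5\\ 1 & 4 & -2 \end{vmatrix} = 47,\qquad
d_3 = \begin{vmatrix} 2 & -1 & 0\\ 3 & 2 & 1\\ 1 & 3 & 4 \end{vmatrix} = 21.$$

Hence, the following numbers constitute the solution of the system:

$$x_1 = \frac{13}{28},\qquad x_2 = \frac{47}{28},\qquad x_3 = \frac{21}{28} = \frac{3}{4}.$$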
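
The example can be reproduced with a brief Python sketch of Cramer's rule
for three unknowns. The names `det3` and `cramer3` are assumptions; the
determinant is expanded by the six-term rule (9), and each $d_j$ is formed
by replacing the $j$th column with the constant terms.

```python
from fractions import Fraction

def det3(m):
    """Third-order determinant by the six-term rule."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    return (a11 * a22 * a33 + a12 * a23 * a31 + a13 * a21 * a32
            - a13 * a22 * a31 - a12 * a21 * a33 - a11 * a23 * a32)

def cramer3(a, b):
    """Solve a 3x3 integer system by Cramer's rule; requires d != 0."""
    d = det3(a)
    if d == 0:
        raise ValueError("determinant is zero; Cramer's rule does not apply")
    solution = []
    for j in range(3):
        # Matrix with the j-th column replaced by the constant terms.
        replaced = [list(row[:j]) + [b[i]] + list(row[j + 1:])
                    for i, row in enumerate(a)]
        solution.append(Fraction(det3(replaced), d))
    return solution

coefficients = [[2, -1, 1], [3, 2, -5], [1, 3, -2]]
constants = [0, 1, 4]
x = cramer3(coefficients, constants)
```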


3. Arrangements and Permutations 

In the study of determinants of order $n$ we will need certain
concepts and facts relating to finite sets. Suppose we have a certain
finite set $M$ consisting of $n$ elements, which may be enumerated by
using the natural numbers 1, 2, ..., $n$; since the properties of the
elements of the set $M$ will not play any role whatsoever, we simply
say that the elements of $M$ are the numbers 1, 2, ..., $n$.

Besides the natural order of 1, 2, ..., $n$, we can arrange the
numbers in many other ways. Thus, we can arrange the numbers
1, 2, 3, 4 as 3, 1, 2, 4 or 2, 4, 1, 3 and so on. Every rearrangement
of the numbers 1, 2, ..., $n$ in any definite order is called a
permutation (or arrangement)* of $n$ numbers (or $n$ symbols).

The number of distinct arrangements of $n$ symbols is equal to the
product $1 \cdot 2 \cdots n$, denoted by $n!$ (read "$n$ factorial"). Indeed, the
general form of an arrangement of $n$ symbols is $i_1, i_2, \ldots, i_n$, where
each of the $i_s$ is one of the numbers 1, 2, ..., $n$, without repetitions.
Use any one of the numbers 1, 2, ..., $n$ for $i_1$; this yields $n$ distinct
possibilities. But if $i_1$ has been chosen, then for $i_2$ we can only take
one of the remaining $n - 1$ numbers; that is, the number of different
ways of choosing the symbols $i_1$ and $i_2$ is equal to the product
$n(n - 1)$, and so on.

Thus, the number of arrangements of $n$ symbols for $n = 2$ is
$2! = 2$ (the arrangements 12 and 21; in examples where $n \le 9$ we
do not separate the symbols by commas); for $n = 3$ this number is
$3! = 6$, for $n = 4$ it is $4! = 24$. As $n$ increases, the number of
arrangements increases very fast: for $n = 5$ it is $5! = 120$, and for $n = 10$
it is already 3,628,800.
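
The count $n!$ can be confirmed by direct enumeration for small $n$. The
Python sketch below uses the standard-library generator
`itertools.permutations`; the helper name `count_arrangements` is an
assumption introduced for the illustration.

```python
from itertools import permutations
from math import factorial

def count_arrangements(n):
    """Count all arrangements of the symbols 1, 2, ..., n by listing them."""
    return sum(1 for _ in permutations(range(1, n + 1)))

counts = {n: count_arrangements(n) for n in range(2, 7)}
```

Enumeration quickly becomes impractical, which is the point of the remark
on growth: `factorial(10)` is already in the millions.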

If in a certain arrangement we interchange any two symbols
(not necessarily adjacent) and leave all the remaining ones fixed, we
obtain a new arrangement. This operation is called a transposition.

All $n!$ arrangements of $n$ symbols may be ordered so that each is
obtained from the preceding one via a single transposition; any
arrangement can serve as the starting point.

This assertion holds true for $n = 2$: if it is required to begin
with the arrangement 12, the desired order will be 12, 21; if we
begin with the arrangement 21, then the order will be 21, 12.
Suppose our assertion has already been proved for $n - 1$, and we prove
it for $n$. Let us begin with the arrangement

$$i_1, i_2, \ldots, i_n \tag{1}$$

We consider all arrangements of $n$ symbols starting with $i_1$. There
are $(n - 1)!$ such arrangements and they may be ordered in accord
with the requirements of the theorem, beginning with (1), since this
actually reduces to an ordering of all arrangements of $n - 1$
symbols; this ordering, by the induction hypothesis, may be initiated
from any arrangement, say, $i_2, \ldots, i_n$. In the last of the
arrangements of $n$ symbols thus obtained we perform a transposition of $i_1$
and any other symbol (say $i_2$) and, again beginning with the
arrangement obtained, we appropriately order all the arrangements with
$i_2$ in first place, and so forth. It is thus obviously possible to
enumerate all arrangements of $n$ symbols.

* Translator's note: the term "arrangement" will be used, since
"permutation" is reserved in this text for a different concept.

From this theorem it follows that it is possible to pass from any
arrangement of $n$ symbols to any other arrangement of the same
symbols by means of several transpositions.
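
The inductive construction in the proof can be followed step by step in
code. The Python sketch below is one possible reading of it, with the
assumed name `transposition_order`: arrangements with a fixed first symbol
are ordered recursively, and then the first symbol is transposed with a
symbol that has not yet stood in first place.

```python
def transposition_order(start):
    """List all arrangements of the symbols in `start`, beginning with `start`,
    so that neighbours differ by a single transposition."""
    start = tuple(start)
    if len(start) <= 1:
        return [start]
    result, current, used = [], start, set()
    for _ in range(len(start)):
        head = current[0]
        used.add(head)
        # Induction step: order all arrangements that keep `head` in first place.
        result.extend((head,) + tail for tail in transposition_order(current[1:]))
        last = result[-1]
        unused = [p for p, s in enumerate(last) if s not in used]
        if not unused:
            break
        # One transposition brings a fresh symbol into first place.
        nxt = list(last)
        nxt[0], nxt[unused[0]] = nxt[unused[0]], nxt[0]
        current = tuple(nxt)
    return result

order = transposition_order((3, 1, 2, 4))
```

Any arrangement may be passed as the starting point, as the theorem
requires.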

We say that in a given arrangement the numbers $i$ and $j$
constitute an inversion if $i > j$ but $i$ comes before $j$ in the arrangement.
An arrangement is termed even if its symbols form an even number
of inversions; otherwise it is odd. Thus, the arrangement 1, 2, ..., $n$
is even for any $n$ since the number of inversions here is zero. The
arrangement 451362 ($n = 6$) contains 8 inversions and so is even.
The arrangement 38524671 ($n = 8$) contains 15 inversions and so is
odd.
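
Counting inversions is a direct translation of the definition. The Python
sketch below (the names `inversions` and `is_even` are assumptions)
compares every pair of positions and can be used to check the two
arrangements just mentioned.

```python
def inversions(arrangement):
    """Number of pairs in which a larger symbol precedes a smaller one."""
    n = len(arrangement)
    return sum(1 for p in range(n) for q in range(p + 1, n)
               if arrangement[p] > arrangement[q])

def is_even(arrangement):
    """An arrangement is even when its number of inversions is even."""
    return inversions(arrangement) % 2 == 0

first = [4, 5, 1, 3, 6, 2]
second = [3, 8, 5, 2, 4, 6, 7, 1]
```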

Every transposition changes the parity of the arrangement.

To prove this important theorem let us first consider the case
where the symbols $i$ and $j$ being interchanged are adjacent; in other
words, the arrangement is of the form ..., $i$, $j$, ..., where the
dots stand for symbols unaltered by the transposition. The
transposition converts our arrangement into the arrangement ..., $j$,
$i$, ..., it being understood that in both cases each of the symbols $i$, $j$
constitutes the same set of inversions with the symbols which remain
fixed. If earlier $i$ and $j$ did not constitute an inversion, then in the
new arrangement there is a fresh inversion; hence, the number of
inversions has increased by unity; contrariwise, if they originally
formed an inversion, then the inversion now vanishes, the number
of inversions being diminished by unity. In both cases the parity
of the arrangement is altered.

Now let us suppose that there are $s$ symbols, $s > 0$, between
$i$ and $j$; that is, the arrangement is of the form

$$\ldots, i, k_1, k_2, \ldots, k_s, j, \ldots \tag{2}$$

The symbols $i$ and $j$ may be interchanged by means of a succession
of $2s + 1$ transpositions of adjacent elements. These are
transpositions interchanging the symbols $i$ and $k_1$, then interchanging $i$
(now in the place of $k_1$) and $k_2$, and so on until $i$ occupies the site
of symbol $k_s$. These $s$ transpositions are then followed by a
transposition that interchanges the symbols $i$ and $j$ and then $s$
transpositions of the symbol $j$ with all $k$'s; as a result, $j$ occupies the place of
$i$ and the symbols $k$ return to their original sites. We have thus
changed the parity of the arrangement an odd number of times
and for this reason the arrangements (2) and

$$\ldots, j, k_1, k_2, \ldots, k_s, i, \ldots \tag{3}$$

are of different parity.

For $n \ge 2$, the number of even arrangements of $n$ symbols is equal
to the number of odd arrangements, i.e., $\frac{1}{2}n!$.

Indeed, proceeding from the foregoing, order all arrangements
of $n$ symbols so that each one is obtained from the preceding one
by a single transposition. Adjacent arrangements will have
opposite parity, that is, the arrangements are ordered so that even and
odd arrangements alternate. Our assertion now follows from the
obvious remark that for $n \ge 2$ the number $n!$ is even.
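
Both statements, that every transposition changes the parity and that even
and odd arrangements are equally numerous, can be checked exhaustively
for small $n$. The Python sketch below is only such a check, not a proof;
the helper names are assumptions.

```python
from itertools import combinations, permutations

def inversions(arrangement):
    """Number of pairs in which a larger symbol precedes a smaller one."""
    n = len(arrangement)
    return sum(1 for p in range(n) for q in range(p + 1, n)
               if arrangement[p] > arrangement[q])

def every_transposition_flips_parity(arrangement):
    """Try all transpositions of two positions and compare parities."""
    before = inversions(arrangement) % 2
    for p, q in combinations(range(len(arrangement)), 2):
        swapped = list(arrangement)
        swapped[p], swapped[q] = swapped[q], swapped[p]
        if inversions(swapped) % 2 == before:
            return False
    return True

def parity_counts(n):
    """Return (number of even, number of odd) arrangements of 1..n."""
    parities = [inversions(a) % 2 for a in permutations(range(1, n + 1))]
    return parities.count(0), parities.count(1)
```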

Let us now define a new concept, that of a permutation of degree $n$.
Write down two arrangements of $n$ symbols, one under the other,
and place parentheses around them; for example, for $n = 5$,

$$\begin{pmatrix} 3 & 5 & 1 & 4 & 2\\ 5 & 2 & 3 & 4 & 1 \end{pmatrix} \tag{4}$$

In this example,* 5 stands under 3, 2 under 5, etc. We say that the
number 3 goes into 5, 5 goes into 2, 1 goes into 3, the number
4 goes into 4 (or remains fixed) and, finally, 2 goes into 1. Thus,
two arrangements written one under the other in the form shown in
(4) define a certain one-to-one mapping of the set of the first five natural
numbers onto itself, that is, a mapping in which each of the
natural numbers 1, 2, 3, 4, 5 is associated with one of these same natural
numbers, distinct numbers corresponding to distinct numbers. And
since there are only five numbers (a finite set), each one corresponds
to one of the five numbers 1, 2, 3, 4, 5, namely, that one into which
it "goes".

* This array looks like a matrix of two rows and five columns, but its
meaning is quite different.

It is clear that the one-to-one mapping of the set of the first
five natural numbers which we obtained by means of (4) could be
obtained by writing certain other pairs of arrangements of five
symbols one under the other. These are obtained from (4) by means of
several transpositions of the columns, such as, for instance,

$$\begin{pmatrix} 2 & 1 & 5 & 3 & 4\\ 1 & 3 & 2 & 5 & 4 \end{pmatrix},\quad
\begin{pmatrix} 1 & 5 & 2 & 4 & 3\\ 3 & 2 & 1 & 4 & 5 \end{pmatrix},\quad
\begin{pmatrix} 2 & 5 & 1 & 4 & 3\\ 1 & 2 & 3 & 4 & 5 \end{pmatrix} \tag{5}$$

In all these notations, 3 goes into 5, 5 into 2, etc.

Similarly, two arrangements of $n$ symbols written one under the
other define a one-to-one mapping of the set of the first $n$ natural
numbers onto itself. Any one-to-one mapping $A$ of the set of the
first $n$ natural numbers onto itself is termed a permutation of degree $n$.
Obviously, any permutation $A$ may be written with the help of two
arrangements, written one under the other:

$$A = \begin{pmatrix} i_1 & i_2 & \ldots & i_n\\ \alpha_{i_1} & \alpha_{i_2} & \ldots & \alpha_{i_n} \end{pmatrix} \tag{6}$$

Here, $\alpha_i$ denotes the number into which $i$ ($i = 1, 2, \ldots, n$) goes
in the permutation $A$.

The permutation $A$ possesses many different notations of the
form (6). For instance, (4) and (5) are different ways of denoting
one and the same permutation of degree 5.

It is possible to pass from one mode of notation of the
permutation $A$ to another simply by performing a number of transpositions
of the columns. It is then possible to obtain (6) in a mode such that
the upper (or lower) row is any preassigned arrangement of $n$ symbols.
In particular, any permutation $A$ of degree $n$ may be written as

$$A = \begin{pmatrix} 1 & 2 & \ldots & n\\ \alpha_1 & \alpha_2 & \ldots & \alpha_n \end{pmatrix} \tag{7}$$

that is, with the numbers in the upper row arranged in their natural
order. Given this notation, various permutations differ in the
arrangements of the lower row, and for this reason the number of
permutations of degree $n$ is equal to the number of arrangements of $n$ symbols,
or $n!$.

An instance of an $n$th-degree permutation is the identity
permutation

$$E = \begin{pmatrix} 1 & 2 & \ldots & n\\ 1 & 2 & \ldots & n \end{pmatrix}$$

in which all symbols remain fixed.

It is well to point out that the upper and lower rows of the
permutation $A$ in notation (6) play different roles, so that if interchanged
the result would be a different permutation. Thus, the permutations
of degree 4

$$\begin{pmatrix} 2 & 1 & 4 & 3\\ 4 & 3 & 1 & 2 \end{pmatrix},\qquad
\begin{pmatrix} 4 & 3 & 1 & 2\\ 2 & 1 & 4 & 3 \end{pmatrix}$$

are different: in the first, 2 goes into 4, in the second it goes into 3.

Let us take some permutation $A$ of degree $n$ in the arbitrary
notation (6). The arrangements constituting the upper and lower
rows in this mode can have either identical or opposite parities.
As we know, we can proceed to any other mode of permutation $A$ by
means of successive transpositions in the upper row and corresponding
transpositions in the lower row. However, by performing one
transposition in the upper row of (6) and one transposition of the
corresponding elements in the lower row, we simultaneously alter
the parities of both arrangements and therefore preserve the
coincident or opposite nature of these parities. From this it follows that
in all modes of notation of the permutation $A$, the parities of the upper
and lower rows either coincide or are opposite. In the former case, $A$ is
called even, in the latter, odd. In particular, the identity
permutation is even.

If the permutation $A$ is written as (7) (that is, with the even
arrangement 1, 2, ..., $n$ in the upper row), then the parity of
permutation $A$ is determined by the parity of the arrangement
$\alpha_1, \alpha_2, \ldots, \alpha_n$ of the lower row. Whence it follows that the
number of even permutations of degree $n$ is equal to the number of odd
permutations, that is, $\frac{1}{2}n!$.

The definition of parity of a permutation may be cast in the
following modified form. If, when written in mode (6), the parities of
both rows coincide, then the number of inversions is either even
in both rows or is odd in both, that is, the total number of inversions
in both rows of (6) is even; but if the parities of the rows in mode
(6) are opposite, then the total number of inversions in these two
rows is odd. Thus, permutation $A$ is even if the total number of
inversions in the two rows in any mode of notation is even; it is odd otherwise.

Example. Let there be given a permutation of degree 5:

$$\begin{pmatrix} 3 & 1 & 4 & 5 & 2\\ 2 & 5 & 4 & 3 & 1 \end{pmatrix}$$

There are 4 inversions in the upper row, and 7 inversions in the lower row.
The total number in the two rows is 11, and so the permutation is odd.

Rewrite this permutation as

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5\\ 5 & 1 & 2 & 4 & 3 \end{pmatrix}$$

The number of inversions in the upper row is 0, in the lower, 5; that is, the
total number is again odd. Though the modes of notation differ, they
preserve the parity of the total number of inversions, but not the actual
number of them.
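
The example lends itself to a mechanical check. In the Python sketch below
a permutation is given by its two rows; the names `permutation_parity` and
`natural_form` are assumptions. The second function reorders the columns
so that the upper row becomes 1, 2, ..., $n$.

```python
def inversions(arrangement):
    """Number of pairs in which a larger symbol precedes a smaller one."""
    n = len(arrangement)
    return sum(1 for p in range(n) for q in range(p + 1, n)
               if arrangement[p] > arrangement[q])

def permutation_parity(upper, lower):
    """Parity from the total number of inversions in the two rows."""
    total = inversions(upper) + inversions(lower)
    return "even" if total % 2 == 0 else "odd"

def natural_form(upper, lower):
    """Transpose columns until the upper row is in natural order."""
    columns = sorted(zip(upper, lower))
    return [u for u, _ in columns], [l for _, l in columns]

upper, lower = [3, 1, 4, 5, 2], [2, 5, 4, 3, 1]
new_upper, new_lower = natural_form(upper, lower)
```

The total number of inversions changes from 11 to 5 under this
rearrangement, while the parity stays the same.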


We wish to indicate other ways, equivalent to those given above,
of defining parities of permutations.* For this purpose we define
multiplication of permutations, which is of great interest in itself.
As we already know, a permutation of degree $n$ is a one-to-one
mapping of the set of numbers 1, 2, ..., $n$ onto itself. The result of a
successive execution of two one-to-one mappings of the set 1, 2, ..., $n$
onto itself will obviously again be a certain one-to-one mapping of
the set onto itself, that is to say, a successive execution of two
permutations of degree $n$ leads to a certain very definite third permutation
of degree $n$ called the product of the first by the second. Thus, if we
have the permutations of degree four,

$$A = \ldots,\qquad B = \ldots,$$

* This material may be omitted in a first reading since it will be required
only in Chapter 14.

then

$$AB = \ldots$$

In the permutation $A$, the symbol 1 goes into 3, but in $B$ the symbol
3 goes into 4, and so for $AB$ the symbol 1 goes into 4, etc.

Multiplication is only possible with permutations of the same
degree. Multiplication of permutations of degree $n$ for $n \ge 3$ is
noncommutative. Indeed, using $A$ and $B$, the product $BA$ yields

$$BA = \ldots,$$

which shows that the permutation $BA$ differs from the permutation
$AB$. Such examples may be chosen for all $n$, $n \ge 3$, although for
certain pairs of permutations commutativity may accidentally be
fulfilled.

The multiplication of permutations is associative; that is, we can
speak of the product of any finite number of permutations of degree
$n$ taken (because of noncommutativity) in a definite order. Indeed, let
there be given permutations $A$, $B$ and $C$ and let the symbol $i_1$,
$1 \le i_1 \le n$, go to $i_2$ in $A$, $i_2$ to $i_3$ in $B$ and $i_3$ to $i_4$ in the
permutation $C$. Then in the permutation $AB$, $i_1$ goes to $i_3$, in $BC$ the
symbol $i_2$ goes to $i_4$, and therefore the symbol $i_1$ goes to $i_4$ whether
we perform $(AB)C$ or $A(BC)$.

It is obvious that the product of any permutation $A$ by the identity
permutation $E$ (and also the product of $E$ by $A$) is equal to $A$:

$$AE = EA = A$$
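
Multiplication of permutations can be illustrated with a short Python
sketch. A permutation is stored as the lower row of notation (7), and
`multiply(a, b)` performs `a` first and `b` second, as in the text. The
particular permutations `A`, `B` and `C` below are hypothetical: they were
chosen only to agree with the statements that 1 goes into 3 in $A$ and 3
goes into 4 in $B$, and are not taken from the text.

```python
def multiply(a, b):
    """Product of two permutations: apply a, then b (lower rows as tuples)."""
    if len(a) != len(b):
        raise ValueError("permutations must have the same degree")
    return tuple(b[image - 1] for image in a)

# Hypothetical permutations of degree four (assumed for illustration).
A = (3, 1, 4, 2)   # 1 -> 3, 2 -> 1, 3 -> 4, 4 -> 2
B = (1, 3, 4, 2)   # 1 -> 1, 2 -> 3, 3 -> 4, 4 -> 2
C = (2, 1, 4, 3)
E = (1, 2, 3, 4)   # identity permutation

AB = multiply(A, B)
BA = multiply(B, A)
```

With these choices the symbol 1 goes into 4 under `AB`, and `AB` differs
from `BA`, so the pair also serves as an instance of noncommutativity.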


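Let us now define the inverse of the permutation $A$ as the
permutation $A^{-1}$ of the same degree such that

$$AA^{-1} = A^{-1}A = E$$

It is easy to see that the inverse of

$$A = \begin{pmatrix} 1 & 2 & \ldots & n\\ \alpha_1 & \alpha_2 & \ldots & \alpha_n \end{pmatrix}$$

is the permutation

$$A^{-1} = \begin{pmatrix} \alpha_1 & \alpha_2 & \ldots & \alpha_n\\ 1 & 2 & \ldots & n \end{pmatrix}$$

obtained from $A$ by interchanging the upper and lower rows.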

Let us now examine permutations of a special kind which are
obtained from the identity permutation $E$ by means of a single
transposition performed in the lower row. Such permutations are
odd; they are termed transpositions and are of the form

$$\begin{pmatrix} \ldots & i & \ldots & j & \ldots\\ \ldots & j & \ldots & i & \ldots \end{pmatrix} \tag{8}$$

where the dots stand for symbols that remain fixed. Let us agree
to denote this transposition by the symbol $(i, j)$. Application of the
transposition of symbols $i$, $j$ to the lower row of (7) of an arbitrary
permutation $A$ is equivalent to multiplying $A$ on the right by the
permutation (8), that is, by $(i, j)$. We know that all arrangements
of $n$ symbols may be obtained from one of them, say from 1, 2, ...,
$n$, by successive transpositions, and so any permutation may
be obtained from the identity permutation by successive
transpositions in the lower row, that is, by successive multiplication by
permutations of the form (8). It can therefore be asserted (omitting
the factor $E$) that any permutation can be represented as a product of
transpositions.

Any permutation may be factored into a product of transposi- 
tions in many different ways. It is always possible, for example, to 
add two identical factors of the form {i, f) (f, ;), which when mul- 
tiplied yield E, that is to soy, cancel out. Lot us take a somewlial 
loss trivial instance: 

f V 

^ ’’j - (vi) {!.->) (ro = d d (24) m (34) (i3) 


This new way of defining the parity of a permutation is based 
on th(‘ following theorem. 

For nil jfjctorizations of a permutation into a product of transpo- 
sitions. the parity of the number of these transpositions is the same and 
coincides u-ilh the parity of Ihe permutation. 

Thus, in liie example given above, the permutation is odd, as 
may also be varified by counting the number of inversions. 

This theorem will be proved if we demonstrate that the product
of any k transpositions is a permutation whose parity coincides with
the parity of the number k. For k = 1 this is true because a
transposition is an odd permutation. Let our assertion be proved for the
case of k - 1 factors. Then its validity for k factors follows from
the fact that the numbers k - 1 and k are of opposite parity and
the multiplication of a permutation (in this case, the product of
the first k - 1 factors) by a transposition is equivalent to this
transposition performed in the lower row of the permutation, which
is to say, it changes the parity.
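
The theorem can be checked mechanically on a small case. The following Python sketch is an illustration only: the helper names `inversions` and `apply_transpositions` are assumptions introduced here, and the two sample factorizations are illustrative data. Products are read left to right, so that multiplying on the right by (i, j) swaps the symbols i and j in the lower row.

```python
def inversions(row):
    """Number of pairs of positions p < q with row[p] > row[q]."""
    n = len(row)
    return sum(1 for p in range(n) for q in range(p + 1, n) if row[p] > row[q])


def apply_transpositions(n, transpositions):
    """Lower row of the product of the given transpositions, read left to right.

    Starting from the identity 1, 2, ..., n, each factor (i, j) swaps the
    symbols i and j wherever they currently stand in the lower row.
    """
    row = list(range(1, n + 1))
    for i, j in transpositions:
        row = [j if x == i else i if x == j else x for x in row]
    return row


# Two factorizations of one permutation of degree 5 (illustrative data).
shorter = [(1, 2), (1, 5), (3, 4)]
longer = [(1, 4), (2, 4), (4, 5), (3, 4), (1, 3)]

row_shorter = apply_transpositions(5, shorter)
row_longer = apply_transpositions(5, longer)

# Both products give the same lower row; the number of factors differs,
# but its parity agrees with the parity of the number of inversions.
print(row_shorter, row_longer)
print(len(shorter) % 2, len(longer) % 2, inversions(row_shorter) % 2)
```

Both factor counts (3 and 5) are odd, as is the number of inversions of the resulting lower row.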

Decomposition into cycles is a convenient way of writing permutations
which makes it easy to find their parity. Any permutation
of degree n can leave certain symbols 1, 2, ..., n fixed while
moving others. A cyclic permutation (or, simply, a cycle) is a
permutation such that when it is repeated a sufficient number of times
any one of the symbols it moves can be transformed into any other such symbol.
Such, for instance, is the permutation of degree eight

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 1 & 8 & 6 & 4 & 5 & 2 & 7 & 3 \end{pmatrix}$$

It transfers the symbols 2, 3, 6, and 8, with 2 going into 8, 8 into
3, 3 into 6, and 6 again into 2.

All transpositions are cycles. By analogy with the earlier
used abbreviated notation for transpositions, the following notation 
is used for cycles: the symbols being transferred are enclosed in 
parentheses in the order in which they go into one another when the 
permutation is repeated; any transferable symbol can serve as the 
starting point, and the last one is that which goes into the first. 
Thus, for the example given above, this notation has the form 

(2 8 3 6) 


The number of symbols transferred by a cycle is called the cycle 
length. 

Two cycles of degree n are called disjoint if they do not have any 
common symbols subject to transfer. It is clear that in multiplica- 
tion of disjoint cycles, the order of the factors does not affect the 
result. 

Any permutation can be factored uniquely into a product of pairwise
disjoint cycles. The proof is simple and so we omit it. In actual
practice, the factorization is accomplished in the following manner:
begin with any one of the symbols subject to transfer, write out
those symbols into which it goes as the permutation is repeated until you
arrive at the original symbol. After thus "closing" the cycle, begin
with one of the remaining transferable symbols to obtain the second
cycle, and so on.

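Examples.

(1)

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 5 & 1 & 2 & 4 \end{pmatrix} = (13)\,(254)$$

(2)

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 5 & 2 & 8 & 7 & 6 & 1 & 4 & 3 \end{pmatrix} = (156)\,(38)\,(47)$$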


Conversely, for any permutation specified by a decomposition into disjoint
cycles, it is possible to find a notation in ordinary form, provided that the
degree of the permutation is known. For example,

(3)

$$(1372)\,(45) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 3 & 1 & 7 & 5 & 4 & 6 & 2 \end{pmatrix}$$

if it is known that the permutation is of degree 7.
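
The factorization procedure and its converse can be sketched in Python. This is a minimal illustration, not part of the original text: the function names `to_cycles` and `from_cycles` are assumptions, and a permutation is represented by its lower row as a list, so that symbol i goes into `lower[i - 1]`.

```python
def to_cycles(lower):
    """Disjoint cycles of length >= 2 of the permutation i -> lower[i - 1]."""
    seen, cycles = set(), []
    for start in range(1, len(lower) + 1):
        if start in seen or lower[start - 1] == start:
            continue  # already placed in a cycle, or held fixed
        cycle, x = [], start
        while x not in seen:  # follow the symbol until the cycle closes
            seen.add(x)
            cycle.append(x)
            x = lower[x - 1]
        cycles.append(tuple(cycle))
    return cycles


def from_cycles(cycles, n):
    """Lower row of the permutation of degree n given by disjoint cycles."""
    lower = list(range(1, n + 1))  # symbols not mentioned stay fixed
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            lower[a - 1] = b  # each symbol goes into its successor in the cycle
    return lower


print(to_cycles([3, 5, 1, 2, 4]))
print(to_cycles([5, 2, 8, 7, 6, 1, 4, 3]))
print(from_cycles([(1, 3, 7, 2), (4, 5)], 7))
```

The degree n has to be passed to `from_cycles` explicitly, since the cycles alone do not show how many symbols are held fixed.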

Let there be given a permutation of degree n and let s be the number of
disjoint cycles in its decomposition plus the number of symbols which it holds
fixed*. The difference n - s is called the decrement of this permutation. The
decrement is obviously equal to the number of actually transferable symbols
diminished by the number of disjoint cycles entering into the decomposition
of the permutation. For Examples 1, 2, and 3 above, the decrement will be
equal to 3, 4, and 4, respectively.

The parity of a permutation coincides with the parity of the decrement of the 
permutation. 

Indeed, any cycle of length k may be represented in the following manner
as the product of k - 1 transpositions:

$$(i_1 i_2 \ldots i_k) = (i_1 i_2)\,(i_1 i_3) \cdots (i_1 i_k).$$

Let us now suppose we have an expansion of permutation A into disjoint
cycles. If each one of the cycles is factored by the indicated method into a
product of transpositions, we get a representation of permutation A in the form
of a product of transpositions. The number of these transpositions will obviously
be less than the number of symbols actually transferable by A by a number
equal to the number of disjoint cycles in the decomposition of the permutation.
Whence it follows that the permutation A may be factored into a product
of transpositions whose number is equal to the decrement, and for this reason
the parity of the permutation is determined by the parity of the decrement.
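
As a hedged check of this statement, the sketch below computes the decrement and compares its parity with the parity of the number of inversions for every permutation of small degree. The function names are assumptions made for this illustration; a permutation is again given by its lower row.

```python
from itertools import permutations


def inversions(row):
    """Number of pairs of positions p < q with row[p] > row[q]."""
    n = len(row)
    return sum(1 for p in range(n) for q in range(p + 1, n) if row[p] > row[q])


def decrement(lower):
    """n - s, where s counts the disjoint cycles plus the fixed symbols."""
    n, seen, s = len(lower), set(), 0
    for start in range(1, n + 1):
        if start in seen:
            continue
        s += 1  # a new cycle, or a fixed symbol (a "cycle" of length 1)
        x = start
        while x not in seen:
            seen.add(x)
            x = lower[x - 1]
    return n - s


# Decrements of the permutations in Examples 1, 2, and 3.
print(decrement([3, 5, 1, 2, 4]),
      decrement([5, 2, 8, 7, 6, 1, 4, 3]),
      decrement([3, 1, 7, 5, 4, 6, 2]))

# Parity of the decrement versus parity of the inversion count, degrees 1..5.
agree = all(
    decrement(list(p)) % 2 == inversions(p) % 2
    for n in range(1, 6)
    for p in permutations(range(1, n + 1))
)
print(agree)
```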


4. Determinants of nth Order


We now wish to generalize the results obtained in Sec. 2 for n = 2
and n = 3 to the case of an arbitrary n. For this purpose, we have
to introduce determinants of order n. However, it is not possible
to do that the way we introduced determinants of order two and
three, that is, by solving a system of linear equations in the general
form: as n increased, the computations would become progressively
more unwieldy, and totally unmanageable for arbitrary n. We choose
a different approach. Considering the determinants of order two
and three which we are already familiar with, let us attempt to
establish a general law expressing these determinants in terms of
the elements of the corresponding matrices, and then let us apply
that law as a definition for an nth-order determinant. After that we
will prove that Cramer's rule holds true under such a definition.

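Recall the expressions for determinants of order two and three:

$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21},$$

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}$$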


We see that any term of a second-order determinant is a product
of two elements which lie in different rows and also in different
columns, and also that all products of this type that may be formed
from the elements of a second-order matrix (two altogether) are
utilized as terms of the determinant. Similarly, every term of a
third-order determinant is a product of three elements, also taken
one in each row and each column; again, all such products are utilized
as terms of the determinant.

* With every symbol which the permutation holds fixed it is possible
to associate a "cycle" of length 1, i.e., say, in Example 2 above we could write
(156) (38) (47) (2). But we shall not do that.

Let us now take a square matrix of order n: 

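$$\begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{pmatrix} \tag{1}$$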

We consider all possible products of the n elements of this matrix
located in different rows and different columns, that is, products
of the form

$$a_{1\alpha_1} a_{2\alpha_2} \cdots a_{n\alpha_n} \tag{2}$$

where the subscripts $\alpha_1, \alpha_2, \ldots, \alpha_n$ constitute an arrangement of
the numbers 1, 2, ..., n. The number of such products is equal
to the number of different arrangements of n symbols, or n!. We
consider all these products as terms of the future nth-order determinant
associated with the matrix (1).

To determine the sign affixed to product (2) in the determinant, 
note that, using the subscripts of this product, we can form the 
permutation 

$$\begin{pmatrix} 1 & 2 & \ldots & n \\ \alpha_1 & \alpha_2 & \ldots & \alpha_n \end{pmatrix} \tag{3}$$

where i goes into $\alpha_i$ if an element in the ith row and $\alpha_i$th column
of matrix (1) enters into the product (2). Examining expressions
of determinants of second and third order, we note that the plus
sign is affixed to the terms whose subscripts constitute an even
permutation, and the minus sign to those terms with an odd
permutation of subscripts. It is also natural to retain this regularity in the
definition of a determinant of order n.

We thus arrive at the following definition: the nth-order
determinant associated with matrix (1) is the algebraic sum of n! terms
which is constructed in the following fashion: the terms are all
possible products of the n elements of the matrix taken one in each
row and each column, the term having a plus sign if its subscripts
form an even permutation, and a minus sign otherwise.
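
The definition translates directly into a short program. The following Python sketch is only an illustration of the definition, with hypothetical names `sign` and `det_by_definition`; it uses 0-based indices, which does not affect parity, and it is practical only for small n because the number of terms is n!.

```python
from itertools import permutations


def sign(alpha):
    """+1 for an even arrangement, -1 for an odd one (by counting inversions)."""
    n = len(alpha)
    inv = sum(1 for p in range(n) for q in range(p + 1, n) if alpha[p] > alpha[q])
    return -1 if inv % 2 else 1


def det_by_definition(a):
    """Algebraic sum of all n! products taken one element per row and column."""
    n = len(a)
    total = 0
    for alpha in permutations(range(n)):  # alpha[i] is the column chosen in row i
        term = sign(alpha)
        for i in range(n):
            term *= a[i][alpha[i]]
        total += term
    return total


print(det_by_definition([[1, 2], [3, 4]]))
print(det_by_definition([[2, 0, 1], [1, 3, -1], [0, 5, 4]]))
```

For the second-order sample the sum is 1·4 - 2·3, in agreement with the familiar formula for n = 2.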

For the notation of the nth-order determinant associated with
matrix (1) we will, as in the case of determinants of order two and
three, use the symbol

$$\begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix} \tag{4}$$


Determinants of the nth order become determinants of order two
and three for n = 2 and n = 3; for n = 1, that is, for matrices
consisting of a single element, the determinant is equal to that
element. So far we do not know whether it is possible, for n > 3,
to use the nth-order determinant for solving systems of linear
equations. That will be shown in Sec. 7. It will be necessary first to subject
the nth-order determinants to a detailed study and, in particular,
it will be necessary to find procedures for evaluating them, since
to compute a determinant directly (via its definition), even for n
not very large, would be extremely complicated.

For the present let us establish some of the simpler properties
of nth-order determinants that refer mainly to one of the two following
problems: on the one hand, we are interested in the conditions
under which a determinant is equal to zero; on the other, we will
indicate certain matrix transformations which leave its determinant
unchanged or result in readily perceivable alterations.

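The transpose operation with respect to matrix (1) is a
transformation of the matrix in which its rows become columns with the
same subscripts; in other words, it is a transition from matrix (1)
to the matrix

$$\begin{pmatrix} a_{11} & a_{21} & \ldots & a_{n1} \\ a_{12} & a_{22} & \ldots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \ldots & a_{nn} \end{pmatrix} \tag{5}$$

or we can say that a transposition is obtained by flipping matrix (1)
over the principal diagonal. Accordingly, we say that the determinant

$$\begin{vmatrix} a_{11} & a_{21} & \ldots & a_{n1} \\ a_{12} & a_{22} & \ldots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \ldots & a_{nn} \end{vmatrix} \tag{6}$$
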

is obtained by taking the transpose of the determinant (4).

Property 1. Taking the transpose does not change the determinant.

Indeed, every term of determinant (4) is of the form

$$a_{1\alpha_1} a_{2\alpha_2} \cdots a_{n\alpha_n} \tag{7}$$

where the second subscripts form an arrangement of the symbols
1, 2, ..., n. However, all the factors of product (7) remain in
different rows and different columns in determinant (6) as well;
hence (7) serves as a term of the transpose of the determinant too.
The converse is also obviously true and for this reason the
determinants (4) and (6) consist of the same terms. The sign of the term
(7) in determinant (4) is determined by the parity of the permutation

$$\begin{pmatrix} 1 & 2 & \ldots & n \\ \alpha_1 & \alpha_2 & \ldots & \alpha_n \end{pmatrix} \tag{8}$$

In determinant (6) the first subscripts of the elements indicate the
column, the second subscripts the row, and so term (7) in
determinant (6) is associated with the permutation

$$\begin{pmatrix} \alpha_1 & \alpha_2 & \ldots & \alpha_n \\ 1 & 2 & \ldots & n \end{pmatrix} \tag{9}$$

In the general case, the permutations (8) and (9) are different, but
they obviously have the same parity and so term (7) has the same
sign in both determinants. Thus the determinants (4) and (6) are
sums of the same terms taken with the same signs, that is, they are
equal.

From Property 1 it follows that any assertion about rows holds
true for the columns of a determinant and conversely; in other words,
in contrast to a matrix, in a determinant the rows and columns are of
equal status. We will therefore formulate and prove Properties 2 to 9
only for the rows of a determinant; analogous properties for columns
will not require special proof.

Property 2. If one of the rows of a determinant consists of zeros,
then the determinant is zero.

Indeed, let all the elements of the ith row of a determinant be
zeros. Every term of the determinant must have, as a factor, one
element of the ith row, and so in our case all the terms of the
determinant are zero.

Property 3. If a determinant is obtained from another one by
interchanging two rows, then all terms of the first determinant will
be terms of the second but with signs reversed; which means that
interchanging two rows of a determinant only changes the sign.

Suppose in determinant (4), the ith and jth rows (i ≠ j) are
interchanged and all other rows remain fixed. We get the determinant

$$\begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{j1} & a_{j2} & \ldots & a_{jn} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \ldots & a_{in} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix} \begin{matrix} \\ \\ (i) \\ \\ (j) \\ \\ \\ \end{matrix} \tag{10}$$

(the rows are indicated on the right). If

$$a_{1\alpha_1} a_{2\alpha_2} \cdots a_{n\alpha_n} \tag{11}$$

is a term of (4), then all its factors in (10) as well obviously remain
in different rows and different columns. Thus, determinants (4)
and (10) consist of the same terms. Term (11) in determinant (4) is
associated with the permutation

$$\begin{pmatrix} 1 & 2 & \ldots & i & \ldots & j & \ldots & n \\ \alpha_1 & \alpha_2 & \ldots & \alpha_i & \ldots & \alpha_j & \ldots & \alpha_n \end{pmatrix} \tag{12}$$

and in determinant (10) with the permutation

$$\begin{pmatrix} 1 & 2 & \ldots & j & \ldots & i & \ldots & n \\ \alpha_1 & \alpha_2 & \ldots & \alpha_i & \ldots & \alpha_j & \ldots & \alpha_n \end{pmatrix} \tag{13}$$

since, for example, element $a_{i\alpha_i}$ now lies in the jth row but remains
in the old $\alpha_i$th column. The permutation (13), however, is obtained
from (12) via a single transposition in the upper row; it thus has
opposite parity. Whence it follows that all terms of determinant (4)
enter into determinant (10) with opposite signs, that is, determinants (4)
and (10) differ in sign alone.

Property 4. A determinant containing two identical rows is equal
to zero.

Indeed, let a determinant be equal to the number d and let
the corresponding elements of its ith and jth rows (i ≠ j) be equal.
After interchanging these two rows, the determinant will, by
Property 3, be equal to -d. But since identical rows are
interchanged, the determinant actually does not change, that is,
d = -d, whence d = 0.

Property 5. If all the elements of some row of a determinant are
multiplied by some number k, then the determinant itself is
multiplied by k.

Let all elements of the ith row be multiplied by k. Each term
of the determinant contains exactly one element of the ith row,
therefore every term acquires the factor k, which means the
determinant itself is multiplied by k.

This property admits of the following formulation as well: 
a common factor of all elements of some row of a determinant may be 
factored out of the determinant. 

Property 6. A determinant with two proportional rows is equal
to zero.

Let the elements of the jth row of a determinant differ from the
corresponding elements of the ith row (i ≠ j) by one and the same
factor k. Factoring this common factor k out of the jth row of the
determinant, we obtain a determinant with two identical rows, which
by Property 4 is zero.

Property 4 (and also Property 2 for n > 1) is obviously a
special case of Property 6 (for k = 1 and k = 0).

Property 7. If all the elements of the ith row of a determinant
of order n are given as a sum of two terms:

$$a_{ij} = b_j + c_j, \quad j = 1, \ldots, n,$$

then the determinant is equal to the sum of two determinants in which
all rows (except the ith) are the same as in the given determinant and
the ith row in one of the summands consists of the elements $b_j$ and in
the other, of the elements $c_j$.

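Indeed, any term of the given determinant may be represented
in the form

$$a_{1\alpha_1} a_{2\alpha_2} \cdots a_{i\alpha_i} \cdots a_{n\alpha_n} = a_{1\alpha_1} a_{2\alpha_2} \cdots (b_{\alpha_i} + c_{\alpha_i}) \cdots a_{n\alpha_n}$$

$$= a_{1\alpha_1} a_{2\alpha_2} \cdots b_{\alpha_i} \cdots a_{n\alpha_n} + a_{1\alpha_1} a_{2\alpha_2} \cdots c_{\alpha_i} \cdots a_{n\alpha_n}$$

Collecting together the first summands of these sums (with the same
signs as the corresponding terms had in the given determinant)
we evidently obtain an nth-order determinant which differs from
the given determinant solely in the fact that the ith row has
elements $b_j$ in place of elements $a_{ij}$. Accordingly, the second summands
form a determinant in the ith row of which are the elements $c_j$. Thus

$$\begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ \vdots & \vdots & & \vdots \\ b_1 + c_1 & b_2 + c_2 & \ldots & b_n + c_n \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix} = \begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ \vdots & \vdots & & \vdots \\ b_1 & b_2 & \ldots & b_n \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix} + \begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ \vdots & \vdots & & \vdots \\ c_1 & c_2 & \ldots & c_n \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix}$$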

Property 7 is readily extended to the case when any element of
the ith row is a sum of m summands, not two, m ≥ 2.

We shall say that the ith row of a determinant is a linear
combination of the remaining rows if for every row with subscript
j, j = 1, ..., i - 1, i + 1, ..., n, there exists a number $k_j$ such
that when the jth row is multiplied by $k_j$ and then all the rows
except the ith are added together (addition of rows is to be
understood in the sense that the elements of the row are added in each
column separately), we obtain the ith row. Some of the coefficients
$k_j$ may be zero, that is, the ith row will actually be a linear
combination not of all but only of a few of the remaining rows. In
particular, if only one of the coefficients $k_j$ is different from zero, we get the
case of proportionality of two rows. Finally, if the row consists
entirely of zeros, it will always be a linear combination of the
remaining rows: the case when all $k_j$ are zero.

Property 8. If one of the rows of a determinant is a linear
combination of the other rows, then the determinant is zero.

For example, let the ith row be a linear combination of s other
rows, 1 ≤ s ≤ n - 1. Then every element of the ith row will be
a sum of s summands, and for this reason, using Property 7, we can
represent our determinant in the form of a sum of determinants in
each of which the ith row will be proportional to one of the other
rows. By Property 6, all these determinants are zero; hence the
given determinant is zero as well.

This property is a generalization of Property 6 and, as will be
proved in Sec. 10, it provides the most general case of a zero
determinant.

Property 9. A determinant remains unchanged if to the elements 
of one of its rows we add corresponding elements of another row mul- 
tiplied by the same number. 

Suppose to the ith row of determinant d we add the jth row,
j ≠ i, multiplied by the number k; that is, in the new determinant
every element of the ith row will be of the form $a_{is} + k a_{js}$,
s = 1, 2, ..., n. Then, by Property 7, this determinant is equal
to the sum of two determinants, the first of which is d and the second
of which contains two proportional rows and is therefore zero.

Since the number k may also be negative, the determinant does
not change even if we subtract from one of its rows a row multiplied
by some number. Generally, a determinant remains unchanged if to
one of its rows we add any linear combination of the other rows.
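
Several of the properties above can be observed numerically. The sketch below is a spot check on one sample matrix, not a proof; the function `det` is a hypothetical helper that evaluates a determinant straight from the definition, and the multipliers 7, 5, and 2 are arbitrary choices made for this illustration.

```python
from itertools import permutations


def det(a):
    """Determinant by definition: signed sum over all arrangements of columns."""
    n = len(a)
    total = 0
    for alpha in permutations(range(n)):
        inv = sum(1 for p in range(n) for q in range(p + 1, n) if alpha[p] > alpha[q])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= a[i][alpha[i]]
        total += term
    return total


a = [[3, 1, -1, 2], [-5, 1, 3, -4], [2, 0, 1, -1], [1, -5, 3, -3]]  # sample data
d = det(a)

transposed = [list(col) for col in zip(*a)]                              # Property 1
swapped = [a[1], a[0], a[2], a[3]]                                       # Property 3
scaled = [a[0], [7 * x for x in a[1]], a[2], a[3]]                       # Property 5
proportional = [a[0], [2 * x for x in a[0]], a[2], a[3]]                 # Property 6
combined = [a[0], a[1], [x + 5 * y for x, y in zip(a[2], a[0])], a[3]]   # Property 9

print(det(transposed) == d, det(swapped) == -d, det(scaled) == 7 * d)
print(det(proportional) == 0, det(combined) == d)
```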

Let us consider an example. A determinant is called skew-symmetric if the
elements symmetric about the principal diagonal differ in sign alone, that
is, if for all i and j it is true that $a_{ji} = -a_{ij}$, whence it follows that for all
i it is true that $a_{ii} = -a_{ii} = 0$. Thus, the determinant is of the form

$$d = \begin{vmatrix} 0 & a_{12} & a_{13} & \ldots & a_{1n} \\ -a_{12} & 0 & a_{23} & \ldots & a_{2n} \\ -a_{13} & -a_{23} & 0 & \ldots & a_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ -a_{1n} & -a_{2n} & -a_{3n} & \ldots & 0 \end{vmatrix}$$

Multiplying each row of this determinant by -1, we obtain the transpose of the
determinant, which is again equal to d, whence, by Property 5, it follows that

$$(-1)^n d = d$$

It then follows, for odd n, that -d = d, or d = 0. Thus any skew-symmetric
determinant of odd order is equal to zero.
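
A quick numerical illustration of this example follows. It is a sketch under stated assumptions: `skew_symmetric` is a hypothetical helper that fills the entries above the diagonal with arbitrary nonzero integers, and `det` evaluates the determinant from the definition.

```python
from itertools import permutations


def det(a):
    """Determinant by definition: signed sum over all arrangements of columns."""
    n = len(a)
    total = 0
    for alpha in permutations(range(n)):
        inv = sum(1 for p in range(n) for q in range(p + 1, n) if alpha[p] > alpha[q])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= a[i][alpha[i]]
        total += term
    return total


def skew_symmetric(n):
    """An n x n matrix with a[j][i] == -a[i][j] and arbitrary sample entries."""
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = 2 * i + 3 * j + 1  # arbitrary nonzero value above the diagonal
            a[j][i] = -a[i][j]
    return a


# Odd orders give zero; an even order need not.
print(det(skew_symmetric(3)), det(skew_symmetric(5)))
print(det(skew_symmetric(4)) != 0)
```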


5. Minors and Their Cofactors 

We have already pointed out that it would be difficult to
compute an nth-order determinant by applying the definition directly,
that is, every time writing out all n! terms, determining their signs,
etc. There are simpler methods for evaluating determinants. They
are based on the fact that a determinant of order n may be expressed
in terms of determinants of lower order. For this purpose we
introduce the following notion.

Let there be a determinant d of order n. Take an integer k which
satisfies the condition 1 ≤ k ≤ n - 1, and in the determinant d
choose arbitrary k rows and k columns. The elements which lie at
the intersection of these rows and columns, that is, which belong
to one of the chosen rows and to one of the chosen columns, will
obviously form a matrix of order k. The determinant of this matrix
is called a minor of order k of the determinant d. We can also say
that the kth-order minor is a determinant obtained by striking out
n - k rows and n - k columns in d. In particular, after striking
out one row and one column in the determinant we obtain a minor
of (n - 1)th order; on the other hand, separate elements of
determinant d will be minors of the first order.

Let us take a minor M of order k in a determinant d of order n.
If we strike out the rows and columns at the intersection of which
this minor stands, we obtain the minor M' of order (n - k) which
is called the complementary minor of the minor M. If, on the
contrary, we strike out the rows and columns which contain elements of
the minor M', then what remains is obviously minor M. Thus, we
can speak of a pair of complementary minors of the determinant. In
particular, the element $a_{ij}$ and the minor of order (n - 1) obtained
by striking out the ith row and the jth column in the determinant
will form a pair of complementary minors.

If a kth-order minor M is located in rows with the position
numbers (indices) $i_1, i_2, \ldots, i_k$ and in columns with the position
numbers $j_1, j_2, \ldots, j_k$, then we use the term cofactor of the minor M
for the complementary minor M' taken with a plus or minus sign
according as the sum of the position numbers of all rows and columns
in which M is located is even or odd, that is, the sum

$$s_M = i_1 + i_2 + \ldots + i_k + j_1 + j_2 + \ldots + j_k \tag{1}$$

In other words, the cofactor of M is the number $(-1)^{s_M} M'$.

The product of any minor M of order k by its cofactor in a
determinant d is an algebraic sum whose summands, which are obtained by
multiplying the terms of the minor M by the terms of the complementary
minor M' taken with the sign $(-1)^{s_M}$, are certain terms of the
determinant d; their signs in this sum coincide with the signs they have in
the determinant.

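We begin the proof of this theorem with the case when the minor
M is located in the upper left corner of the determinant:

$$d = \left| \begin{array}{ccc|ccc} a_{11} & \ldots & a_{1k} & a_{1,k+1} & \ldots & a_{1n} \\ \vdots & M & \vdots & \vdots & & \vdots \\ a_{k1} & \ldots & a_{kk} & a_{k,k+1} & \ldots & a_{kn} \\ \hline a_{k+1,1} & \ldots & a_{k+1,k} & a_{k+1,k+1} & \ldots & a_{k+1,n} \\ \vdots & & \vdots & \vdots & M' & \vdots \\ a_{n1} & \ldots & a_{nk} & a_{n,k+1} & \ldots & a_{nn} \end{array} \right|$$

that is, in rows with position numbers 1, 2, ..., k and in columns
with the same position numbers. Then the minor M' will occupy
the lower right corner of the determinant. The number $s_M$ will then
be even:

$$s_M = 1 + 2 + \ldots + k + 1 + 2 + \ldots + k = 2(1 + 2 + \ldots + k)$$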

therefore, the minor M' itself will serve as the cofactor of M.
Take an arbitrary term

$$a_{1\alpha_1} a_{2\alpha_2} \cdots a_{k\alpha_k} \tag{2}$$

of minor M; its sign in M is $(-1)^l$ if l is the number of
inversions in the permutation

$$\begin{pmatrix} 1 & 2 & \ldots & k \\ \alpha_1 & \alpha_2 & \ldots & \alpha_k \end{pmatrix} \tag{3}$$

The arbitrary term

$$a_{k+1,\beta_{k+1}} a_{k+2,\beta_{k+2}} \cdots a_{n\beta_n} \tag{4}$$

of minor M' has the sign $(-1)^{l'}$ where l' is the number of
inversions in the permutation

$$\begin{pmatrix} k+1 & k+2 & \ldots & n \\ \beta_{k+1} & \beta_{k+2} & \ldots & \beta_n \end{pmatrix} \tag{5}$$

Multiplying the terms (2) and (4), we obtain a product of n
elements

$$a_{1\alpha_1} a_{2\alpha_2} \cdots a_{k\alpha_k} a_{k+1,\beta_{k+1}} a_{k+2,\beta_{k+2}} \cdots a_{n\beta_n} \tag{6}$$

located in different rows and different columns of the determinant.
It is therefore a term of determinant d. The sign of term (6) in the
product MM' is a product of the signs of terms (2) and (4), i.e.,
$(-1)^l \cdot (-1)^{l'} = (-1)^{l+l'}$. However, term (6) has the same sign in
the determinant d as well. Indeed, the lower row of the permutation

$$\begin{pmatrix} 1 & 2 & \ldots & k & k+1 & k+2 & \ldots & n \\ \alpha_1 & \alpha_2 & \ldots & \alpha_k & \beta_{k+1} & \beta_{k+2} & \ldots & \beta_n \end{pmatrix}$$

made up of the subscripts of this term contains only l + l'
inversions, since no α can form an inversion with any one of the β: all α
do not exceed k, all β are not less than k + 1.

This proves the particular case of the theorem that we have
considered. Let us now take up the general case. Suppose that the
minor M lies in the rows with position numbers $i_1, i_2, \ldots, i_k$ and
in the columns with position numbers $j_1, j_2, \ldots, j_k$, with the
condition that

$$i_1 < i_2 < \ldots < i_k, \quad j_1 < j_2 < \ldots < j_k$$

Let us attempt, by interchanging rows and columns of the
determinant, to move the minor M to the upper left corner and let us try
to do this so that the complementary minor is not changed. For
this purpose, interchange the $i_1$th row with the $(i_1 - 1)$th, then
with the $(i_1 - 2)$th and so on until the $i_1$th row occupies the first
row; this requires interchanging the rows $i_1 - 1$ times. Then we
successively interchange the $i_2$th row with rows located above it
until it lies directly under the $i_1$th row (that is, in the position of
the original second row); this, as can readily be verified, will require
interchanging the rows $i_2 - 2$ times. Similarly, we move the $i_3$th
row to the third row, and so on, until the $i_k$th row takes up the
position of the kth row. In all, we will have to perform

$$(i_1 - 1) + (i_2 - 2) + \ldots + (i_k - k) = (i_1 + i_2 + \ldots + i_k) - (1 + 2 + \ldots + k)$$

transpositions of rows.

The minor M is thus located in the first k rows of the new
determinant. We will now successively interchange the columns of
the determinant, the $j_1$th column with all preceding ones, until it
occupies first place, then the $j_2$th column until it occupies second
place, and so forth. In all, the columns will be interchanged

$$(j_1 + j_2 + \ldots + j_k) - (1 + 2 + \ldots + k)$$

times.

All these transformations lead us to a new determinant d' in which
the minor M occupies the upper left corner. Since each time we
interchanged only adjacent rows or columns, the mutual positions of
the rows and columns containing the minor M' in the determinant d
remain without change, and so the minor M' remains complementary
to the minor M in the determinant d'; however, it now occupies the





lower right corner. As was proved above, the product MM' is the
sum of some number of terms of the determinant d' taken with the
same signs as they had in d'. However, the determinant d' is obtained
from the determinant d by means of

$$[(i_1 + i_2 + \ldots + i_k) - (1 + 2 + \ldots + k)] + [(j_1 + j_2 + \ldots + j_k) - (1 + 2 + \ldots + k)] = s_M - 2(1 + 2 + \ldots + k)$$

transpositions of rows and columns, and so, as we know from Sec. 4,
the terms of determinant d' differ from the corresponding terms of
determinant d in sign alone, $(-1)^{s_M}$ (naturally, the even number
2(1 + 2 + ... + k) will not affect the sign). From this it follows
that the product $(-1)^{s_M} MM'$ consists of a certain number of terms
of the determinant d taken with the same signs as they have in that
determinant. The theorem is proved.

Note that if the minors M and M' are complementary, then
the numbers $s_M$ and $s_{M'}$ have the same parity. Indeed, the position
number of any row and any column enters as a summand in one and
only one of these numbers, and therefore the sum $s_M + s_{M'}$ is equal
to the total sum of the position numbers of all rows and columns of
the determinant, i.e., it is equal to the even number
2(1 + 2 + ... + n).


6. Evaluating Determinants

The results of the preceding section enable us to reduce computing
an nth-order determinant to the computation of several
determinants of order (n - 1). Let us first introduce notation: if $a_{ij}$ is an
element of determinant d, then $M_{ij}$ denotes the complementary minor
or, simply, the minor of that element, that is, the minor of order
(n - 1) obtained by striking out the ith row and the jth column of
the determinant. $A_{ij}$ will denote the cofactor of the element $a_{ij}$; thus,

$$A_{ij} = (-1)^{i+j} M_{ij}$$

As was proved in the preceding section, the product $a_{ij}A_{ij}$ is
the sum of several terms of the determinant d which enter into this
sum with the same signs as they have in the determinant d. It is
easy to count these terms: the number is equal to the number of
terms in the minor $M_{ij}$, or (n - 1)!.

Let us now choose any ith row of the determinant d and take
the product of each element of the row by its cofactor:

$$a_{i1}A_{i1}, \quad a_{i2}A_{i2}, \quad \ldots, \quad a_{in}A_{in} \tag{1}$$



No term of the determinant d can be in two different products of
those given in (1): all the terms of the determinant which enter
into the product $a_{i1}A_{i1}$ contain the element $a_{i1}$ of the ith row and




for this reason differ from the terms which enter into the product
$a_{i2}A_{i2}$, that is, those which contain the element $a_{i2}$ of the ith row,
and so on.

On the other hand, the total number of terms of determinant d
which appear in all the products of (1) is equal to

$$(n - 1)! \cdot n = n!$$

Consequently, this exhausts all the terms of the determinant d. We have
thus proved that there is an expansion of the determinant d in terms
of the ith row:

$$d = a_{i1}A_{i1} + a_{i2}A_{i2} + \ldots + a_{in}A_{in} \tag{2}$$

The determinant d is thus equal to the sum of the products of all the
elements of an arbitrary row by their cofactors. A similar expansion
of the determinant can also be obtained about any column.
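
The expansion in terms of a row lends itself to a recursive computation. The sketch below is one possible rendering, not the text's own procedure: `minor` and `det_expand` are hypothetical names, indices are 0-based (which leaves the parity of i + j unchanged), and the sample matrix is illustrative data.

```python
def minor(a, i, j):
    """Matrix left after striking out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det_expand(a, i=0):
    """Expand the determinant in terms of row i: sum of a[i][j] * cofactor."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for j in range(n):
        if a[i][j] == 0:
            continue  # a zero element contributes nothing, so skip its minor
        cofactor = (-1) ** (i + j) * det_expand(minor(a, i, j))
        total += a[i][j] * cofactor
    return total


a = [[3, 1, -1, 2], [-5, 1, 3, -4], [2, 0, 1, -1], [1, -5, 3, -3]]  # sample data

# Every row yields the same value.
print([det_expand(a, i) for i in range(4)])
```

Skipping zero elements is the reason it pays to expand about a row that contains zeros.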

By replacing the cofactors in the expansion (2) by corresponding
minors with a plus or a minus sign, we reduce computation of an
nth-order determinant to the computation of several determinants
of order (n - 1). Note that if some of the elements of the ith row
are zero, then naturally the corresponding minors need not be
evaluated. It is therefore useful, first, to transform the determinant,
using Property 9 (see Sec. 4), so that a large enough number of
elements in one of the rows or in one of the columns are replaced
by zeros. Actually, Property 9 enables us to replace all elements,
except one, by zeros in any row or any column. Indeed, if $a_{ik} \neq 0$,
then any element $a_{ij}$, j ≠ k, of the ith row will be replaced by
a zero after subtracting the kth column multiplied by $a_{ij}/a_{ik}$ from
the jth column. Thus, evaluating a determinant of the nth order
may be reduced to computing a single determinant of order (n - 1).


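Example 1. Evaluate the fourth-order determinant

$$d = \begin{vmatrix} 3 & 1 & -1 & 2 \\ -5 & 1 & 3 & -4 \\ 2 & 0 & 1 & -1 \\ 1 & -5 & 3 & -3 \end{vmatrix}$$

Expand it about the third row by using the zero in that row:

$$d = (-1)^{3+1} \cdot 2 \cdot \begin{vmatrix} 1 & -1 & 2 \\ 1 & 3 & -4 \\ -5 & 3 & -3 \end{vmatrix} + (-1)^{3+3} \cdot 1 \cdot \begin{vmatrix} 3 & 1 & 2 \\ -5 & 1 & -4 \\ 1 & -5 & -3 \end{vmatrix} + (-1)^{3+4} \cdot (-1) \cdot \begin{vmatrix} 3 & 1 & -1 \\ -5 & 1 & 3 \\ 1 & -5 & 3 \end{vmatrix}$$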


Evaluating the third-order determinants thus obtained, we get

$$d = 2 \cdot 16 - 40 + 48 = 40$$

Example 2. Evaluate the fifth-order determinant



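$$d = \begin{vmatrix} -2 & 5 & 0 & -1 & 3 \\ 1 & 0 & 3 & 7 & -2 \\ 3 & -1 & 0 & 5 & -5 \\ 2 & 6 & -4 & 1 & 2 \\ 0 & -3 & -1 & 2 & 3 \end{vmatrix}$$

Adding three times the fifth row to the second and subtracting four times the
fifth row from the fourth row, we get

$$d = \begin{vmatrix} -2 & 5 & 0 & -1 & 3 \\ 1 & -9 & 0 & 13 & 7 \\ 3 & -1 & 0 & 5 & -5 \\ 2 & 18 & 0 & -7 & -10 \\ 0 & -3 & -1 & 2 & 3 \end{vmatrix}$$

Expanding this determinant in terms of the third column, which contains only
one nonzero element (with the sum of subscripts, 5 + 3, being even), we get

$$d = (-1) \cdot \begin{vmatrix} -2 & 5 & -1 & 3 \\ 1 & -9 & 13 & 7 \\ 3 & -1 & 5 & -5 \\ 2 & 18 & -7 & -10 \end{vmatrix}$$

We now transform this determinant by adding two times the second row to
the first row and subtracting three times the second from the third row, and two
times the second from the fourth:

$$d = - \begin{vmatrix} 0 & -13 & 25 & 17 \\ 1 & -9 & 13 & 7 \\ 0 & 26 & -34 & -26 \\ 0 & 36 & -33 & -24 \end{vmatrix}$$

and then expand it in terms of the first column. Noting that the only nonzero
element of this column is associated with an odd sum of subscripts, we get

$$d = \begin{vmatrix} -13 & 25 & 17 \\ 26 & -34 & -26 \\ 36 & -33 & -24 \end{vmatrix}$$

Let us compute this third-order determinant after expanding it in terms of the
third row:

$$d = 36 \cdot \begin{vmatrix} 25 & 17 \\ -34 & -26 \end{vmatrix} - (-33) \cdot \begin{vmatrix} -13 & 17 \\ 26 & -26 \end{vmatrix} + (-24) \cdot \begin{vmatrix} -13 & 25 \\ 26 & -34 \end{vmatrix}$$

$$= 36 \cdot (-72) - (-33) \cdot (-104) + (-24) \cdot (-208) = -1032$$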


Example 3. If all the elements of a determinant located on one side of the
principal diagonal are equal to zero, then the determinant is equal to the product
of the elements on the principal diagonal.

This assertion is obvious for a second-order determinant. We therefore
prove it by induction, that is, we assume that for determinants of order (n - 1)
it has been proved, and then we consider the nth-order determinant

$$d = \begin{vmatrix} a_{11} & a_{12} & a_{13} & \ldots & a_{1n} \\ 0 & a_{22} & a_{23} & \ldots & a_{2n} \\ 0 & 0 & a_{33} & \ldots & a_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \ldots & a_{nn} \end{vmatrix}$$

Expanding it in terms of the first column, we get

$$d = a_{11} \begin{vmatrix} a_{22} & a_{23} & \ldots & a_{2n} \\ 0 & a_{33} & \ldots & a_{3n} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \ldots & a_{nn} \end{vmatrix}$$

But the induction hypothesis is applicable to the minor on the right-hand side:
it is equal to the product $a_{22} a_{33} \cdots a_{nn}$. Therefore

$$d = a_{11} a_{22} \cdots a_{nn}$$
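
Property 9, Property 3, and the result of Example 3 combine into a practical evaluation scheme: reduce the determinant to triangular form by row operations and multiply the diagonal elements. The Python sketch below is an assumption-laden illustration of that idea rather than the procedure followed in the worked examples; the name `det_by_elimination` is hypothetical, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def det_by_elimination(a):
    """Reduce to upper triangular form and multiply the diagonal elements."""
    a = [[Fraction(x) for x in row] for row in a]
    n = len(a)
    d = Fraction(1)
    for c in range(n):
        # Find a row at or below position c with a nonzero element in column c.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # column c cannot be given a nonzero diagonal element
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]  # Property 3: the sign changes
            d = -d
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            # Property 9: subtracting a multiple of another row changes nothing.
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
        d *= a[c][c]  # Example 3: product of the diagonal of a triangular determinant
    return d


example_1 = [[3, 1, -1, 2], [-5, 1, 3, -4], [2, 0, 1, -1], [1, -5, 3, -3]]
example_2 = [[-2, 5, 0, -1, 3], [1, 0, 3, 7, -2], [3, -1, 0, 5, -5],
             [2, 6, -4, 1, 2], [0, -3, -1, 2, 3]]

print(det_by_elimination(example_1), det_by_elimination(example_2))
```

The two sample matrices are those of Examples 1 and 2, so the results can be compared with the values obtained there by expansion.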

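Example 4. The Vandermonde determinant is the determinant

$$d = \begin{vmatrix} 1 & 1 & 1 & \ldots & 1 \\ a_1 & a_2 & a_3 & \ldots & a_n \\ a_1^2 & a_2^2 & a_3^2 & \ldots & a_n^2 \\ \vdots & \vdots & \vdots & & \vdots \\ a_1^{n-1} & a_2^{n-1} & a_3^{n-1} & \ldots & a_n^{n-1} \end{vmatrix}$$

We shall prove that for any n the Vandermonde determinant is equal to the
product of all possible differences $a_i - a_j$, where 1 ≤ j < i ≤ n. Indeed, for
n = 2 we have

$$\begin{vmatrix} 1 & 1 \\ a_1 & a_2 \end{vmatrix} = a_2 - a_1$$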

Suppose our assertion has already been proved for Vandermonde determinants
of order (n - 1). We transform determinant d as follows: subtract from the
nth (last) row the (n - 1)th row multiplied by $a_1$, then from the (n - 1)th
row subtract the (n - 2)th also multiplied by $a_1$, etc. Finally, from the second
row subtract the first multiplied by $a_1$. We obtain

$$d = \begin{vmatrix} 1 & 1 & 1 & \ldots & 1 \\ 0 & a_2 - a_1 & a_3 - a_1 & \ldots & a_n - a_1 \\ 0 & a_2^2 - a_1 a_2 & a_3^2 - a_1 a_3 & \ldots & a_n^2 - a_1 a_n \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & a_2^{n-1} - a_1 a_2^{n-2} & a_3^{n-1} - a_1 a_3^{n-2} & \ldots & a_n^{n-1} - a_1 a_n^{n-2} \end{vmatrix}$$


Expanding this determinant in terms of the first column, we arrive at a
determinant of order (n - 1); after factoring out common factors from all columns,
it will take the form

$$d = (a_2 - a_1)(a_3 - a_1)\dots(a_n - a_1)\begin{vmatrix} 1 & 1 & \dots & 1 \\ a_2 & a_3 & \dots & a_n \\ a_2^2 & a_3^2 & \dots & a_n^2 \\ \dots & \dots & \dots & \dots \\ a_2^{n-2} & a_3^{n-2} & \dots & a_n^{n-2} \end{vmatrix}$$

The last factor is the Vandermonde determinant of order (n - 1), that is, by
hypothesis, it is equal to the product of all the differences $a_i - a_j$ for
$2 \le j < i \le n$. Using the symbol $\prod$ to denote a product, we can write

$$d = (a_2 - a_1)(a_3 - a_1)\dots(a_n - a_1)\prod_{2 \le j < i \le n}(a_i - a_j) = \prod_{1 \le j < i \le n}(a_i - a_j)$$

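Using the same method, we can prove that the determinant

$$d' = \begin{vmatrix} a_1^{n-1} & a_2^{n-1} & a_3^{n-1} & \dots & a_n^{n-1} \\ a_1^{n-2} & a_2^{n-2} & a_3^{n-2} & \dots & a_n^{n-2} \\ \dots & \dots & \dots & \dots & \dots \\ a_1 & a_2 & a_3 & \dots & a_n \\ 1 & 1 & 1 & \dots & 1 \end{vmatrix}$$

is equal to the product of all possible differences $a_i - a_j$, where $1 \le i < j \le n$,
that is,

$$d' = \prod_{1 \le i < j \le n}(a_i - a_j)$$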
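Both product formulas of Example 4 are easy to test numerically. The sketch below is an illustrative addition, not the book's method: `det` and `vandermonde` are assumed helper names, the sample numbers are arbitrary, and the determinant is computed by plain expansion along the first column.

```python
from itertools import combinations
from math import prod


def det(m):
    """Determinant by expansion along the first column."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** i * m[i][0]
               * det([row[1:] for k, row in enumerate(m) if k != i])
               for i in range(len(m)) if m[i][0] != 0)


def vandermonde(a):
    """Rows are the powers 0, 1, ..., n-1 of the numbers a_1, ..., a_n."""
    return [[x ** p for x in a] for p in range(len(a))]


a = [2, 3, 5, 7]
pairs = list(combinations(range(len(a)), 2))      # index pairs (j, i) with j < i

d = det(vandermonde(a))
d_formula = prod(a[i] - a[j] for j, i in pairs)   # product over 1 <= j < i <= n

d_prime = det(vandermonde(a)[::-1])               # rows in the opposite order
d_prime_formula = prod(a[i] - a[j] for i, j in pairs)  # product over 1 <= i < j <= n

print(d, d_formula, d_prime, d_prime_formula)
```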

Generalizing the above-obtained expansions of a determinant
about a row or a column, we prove the following theorem which
has to do with the expansion of a determinant in terms of several rows
or columns.

Laplace's theorem. Let there be arbitrarily chosen, in a determinant
d of order n, k rows (or k columns), $1 \le k \le n - 1$. Then
the sum of the products of all kth-order minors contained in the
chosen rows by their cofactors is equal to the determinant d.

Proof. Suppose, in determinant d, we choose rows with position
numbers $i_1, i_2, \dots, i_k$. We know that the product of any minor M of
order k located in these rows by its cofactor consists of a certain
number of terms of the determinant d taken with the signs they
have in the determinant. The theorem will consequently be proved
if we demonstrate that by making M run through all kth-order
minors located in the chosen rows we obtain all the terms of the
determinant, none being repeated.

Let

$$a_{1\alpha_1}a_{2\alpha_2}\dots a_{n\alpha_n} \qquad (3)$$

be an arbitrary term of the determinant d. We separately take
the product of those elements of the term which belong to the rows
we have chosen with position numbers $i_1, i_2, \dots, i_k$. This is the
product

$$a_{i_1\alpha_{i_1}}a_{i_2\alpha_{i_2}}\dots a_{i_k\alpha_{i_k}} \qquad (4)$$

The k factors of this product lie in k distinct columns, namely,
in the columns with position numbers $\alpha_{i_1}, \alpha_{i_2}, \dots, \alpha_{i_k}$. These
position numbers of the columns are consequently determined by
specifying the term (3). If by M we denote the kth-order minor
lying at the intersection of the columns with these position numbers
$\alpha_{i_1}, \alpha_{i_2}, \dots, \alpha_{i_k}$ and of the earlier chosen rows with the position
numbers $i_1, i_2, \dots, i_k$, then the product (4) is one of the terms
of the minor M, and the product of all the elements of the term (3)
not in (4) is a term of its complementary minor. Thus, any term
of the determinant enters into the product of a certain (quite definite)
minor of order k made up of the chosen rows multiplied by its
complementary minor, and is a product of quite definite terms of these
two minors. Finally, in order to obtain the term that we took of
the determinant with the sign which it has in the determinant,
it remains, as we know, to replace the complementary minor by the
cofactor. This completes the proof of the theorem.

It is possible to give a slightly different proof, namely,
the product of any kth-order minor M located in the chosen rows
by its cofactor consists of k! (n - k)! terms, since the kth-order
minor M consists of k! terms and its cofactor, differing possibly
from the minor of order n - k in sign alone, contains (n - k)!
terms. On the other hand, the number of kth-order minors contained
in the chosen rows is equal to the number of combinations of n taken
k at a time, that is, it is equal to the number

$$\frac{n!}{k!\,(n - k)!}$$

Multiplying out, we find that the sum of the products of all
kth-order minors of the chosen rows by their cofactors consists
of n! summands. Such, however, is the total number of terms of the
determinant d. The theorem will thus be proved if we demonstrate
that any term of the determinant d appears at least once (and,
in that case, exactly once) in the sum at hand of the products
of the minors by their cofactors. It is left to the reader to repeat
(with slight simplifications) the reasoning given in the first proof.

The Laplace theorem enables one to reduce the computation
of an nth-order determinant to the computation of several
determinants of orders k and n - k. Generally speaking, there are a very
large number of such new determinants and so it is advisable to
apply the Laplace theorem only when it is possible to choose k rows
(or columns) in the determinant so that many of the kth-order
minors located in these rows are zero.




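Example 1. Suppose we have a determinant, all elements of which in the
first k rows and the last n - k columns are zero:

$$d = \begin{vmatrix} a_{11} & \dots & a_{1k} & 0 & \dots & 0 \\ \dots & \dots & \dots & \dots & \dots & \dots \\ a_{k1} & \dots & a_{kk} & 0 & \dots & 0 \\ a_{k+1,1} & \dots & a_{k+1,k} & a_{k+1,k+1} & \dots & a_{k+1,n} \\ \dots & \dots & \dots & \dots & \dots & \dots \\ a_{n1} & \dots & a_{nk} & a_{n,k+1} & \dots & a_{nn} \end{vmatrix}$$

Then the determinant is equal to the product of two of its minors:

$$d = \begin{vmatrix} a_{11} & \dots & a_{1k} \\ \dots & \dots & \dots \\ a_{k1} & \dots & a_{kk} \end{vmatrix} \cdot \begin{vmatrix} a_{k+1,k+1} & \dots & a_{k+1,n} \\ \dots & \dots & \dots \\ a_{n,k+1} & \dots & a_{nn} \end{vmatrix}$$

To prove this, it suffices to expand the determinant about the first k rows.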
Example 2. Suppose we have a determinant d of order 2n, in the upper
left corner of which is an nth-order minor composed entirely of zeros. If the
nth-order minors lying in the upper right and lower left and lower right corners
of the determinant are denoted, respectively, by M, M' and M'', so that
the determinant may be written symbolically as

$$d = \begin{vmatrix} 0 & M \\ M' & M'' \end{vmatrix}$$

then $d = (-1)^n MM'$.

To prove this, expand the determinant in terms of the first n rows and
note that

$$s_M = (1 + 2 + \dots + n) + [(n + 1) + (n + 2) + \dots + 2n] = n + 2n^2$$

that is, $s_M$ and n are of the same parity.

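Example 3. Evaluate the determinant

$$d = \begin{vmatrix} -4 & 1 & 2 & -2 & 1 \\ 0 & 3 & 0 & 1 & -5 \\ 2 & -3 & 1 & -3 & 1 \\ -1 & -1 & 3 & -1 & 0 \\ 0 & 4 & 0 & 2 & 5 \end{vmatrix}$$

Expanding it about the first and third columns, which contain nicely located
zeros, we get

$$d = (-1)^{1+3+1+3}\begin{vmatrix} -4 & 2 \\ 2 & 1 \end{vmatrix} \cdot \begin{vmatrix} 3 & 1 & -5 \\ -1 & -1 & 0 \\ 4 & 2 & 5 \end{vmatrix} + (-1)^{1+4+1+3}\begin{vmatrix} -4 & 2 \\ -1 & 3 \end{vmatrix} \cdot \begin{vmatrix} 3 & 1 & -5 \\ -3 & -3 & 1 \\ 4 & 2 & 5 \end{vmatrix}$$

$$+ (-1)^{3+4+1+3}\begin{vmatrix} 2 & 1 \\ -1 & 3 \end{vmatrix} \cdot \begin{vmatrix} 1 & -2 & 1 \\ 3 & 1 & -5 \\ 4 & 2 & 5 \end{vmatrix} = (-8)\cdot(-20) - (-10)\cdot(-62) - 7\cdot 87 = -1069$$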
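Laplace's theorem translates directly into a procedure: run through all sets of k rows, multiply each minor in the chosen columns by its complementary minor and attach the sign given by the sum of the row and column numbers. The Python sketch below is an illustrative addition; the names `det` and `laplace` are assumptions, indices are 0-based, and the result is compared with an ordinary first-column expansion on the determinant of Example 3.

```python
from itertools import combinations


def det(m):
    """Determinant by expansion along the first column."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** i * m[i][0]
               * det([row[1:] for k, row in enumerate(m) if k != i])
               for i in range(len(m)) if m[i][0] != 0)


def laplace(m, cols):
    """Expand det(m) about the chosen columns (0-based, 1 <= len(cols) <= n-1)."""
    n, k = len(m), len(cols)
    other_cols = [c for c in range(n) if c not in cols]
    total = 0
    for rows in combinations(range(n), k):
        minor = det([[m[r][c] for c in cols] for r in rows])
        if minor == 0:
            continue  # zero minors are what make the theorem practical
        other_rows = [r for r in range(n) if r not in rows]
        complementary = det([[m[r][c] for c in other_cols] for r in other_rows])
        # the 0-based index sum has the same parity as the 1-based sum s_M
        sign = (-1) ** (sum(rows) + sum(cols))
        total += sign * minor * complementary
    return total


d = [[-4, 1, 2, -2, 1],
     [0, 3, 0, 1, -5],
     [2, -3, 1, -3, 1],
     [-1, -1, 3, -1, 0],
     [0, 4, 0, 2, 5]]

print(laplace(d, [0, 2]), det(d))  # first and third columns, as in the example
```

Only three of the ten second-order minors in the first and third columns are nonzero here, which is why this choice of columns keeps the computation short.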
7. Cramer's Rule

The foregoing theory of determinants of order n allows us to show
that these determinants, which were introduced only by analogy
with second- and third-order determinants, may, like the latter,
be used to solve systems of linear equations. Let us first make one
additional remark regarding expansions of determinants in terms
of a row or a column; this remark will often come in handy in the
sequel.

Expand the determinant

$$d = \begin{vmatrix} a_{11} & \dots & a_{1j} & \dots & a_{1n} \\ \dots & \dots & \dots & \dots & \dots \\ a_{n1} & \dots & a_{nj} & \dots & a_{nn} \end{vmatrix}$$

about the jth column:

$$d = a_{1j}A_{1j} + a_{2j}A_{2j} + \dots + a_{nj}A_{nj}$$

Then, in this expansion, replace the elements of the jth column by
a set of n arbitrary numbers $b_1, b_2, \dots, b_n$. The expression

$$b_1A_{1j} + b_2A_{2j} + \dots + b_nA_{nj}$$

which you obtain will obviously serve as an expansion about the
jth column for the determinant

$$\begin{vmatrix} a_{11} & \dots & b_1 & \dots & a_{1n} \\ \dots & \dots & \dots & \dots & \dots \\ a_{n1} & \dots & b_n & \dots & a_{nn} \end{vmatrix}$$

which is obtained from the determinant d by replacing its jth column
by a column of the numbers $b_1, b_2, \dots, b_n$. Indeed, replacing the
jth column of d does not affect the minors of the elements of the
column, and for this reason does not affect their cofactors.

Let us apply this to the case when for the numbers $b_1, b_2, \dots, b_n$
we take elements of the kth column of the determinant d when
$k \ne j$. The determinant resulting from such a replacement will
contain two identical columns (the jth and the kth) and therefore will be
zero. Hence, the expansion of this determinant about its jth column
will also be zero, that is

$$a_{1k}A_{1j} + a_{2k}A_{2j} + \dots + a_{nk}A_{nj} = 0 \quad \text{for } j \ne k$$

Thus, the sum of the products of all elements of a certain column
of a determinant by the cofactors of the corresponding elements of
another column is zero. The same result of course holds true for the
rows of a determinant.
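The remark can be illustrated numerically. In the hedged Python sketch below (an addition; `det`, `cofactor` and `column_sum` are assumed names and the sample array is arbitrary), the sum of the products of the elements of column k by the cofactors of column j gives the determinant when k = j and zero otherwise.

```python
def det(m):
    """Determinant by expansion along the first column."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** i * m[i][0]
               * det([row[1:] for k, row in enumerate(m) if k != i])
               for i in range(len(m)) if m[i][0] != 0)


def cofactor(m, i, j):
    """Cofactor A_ij of the element in row i, column j (0-based indices)."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)


def column_sum(m, k, j):
    """Sum over i of a_ik * A_ij: elements of column k, cofactors of column j."""
    return sum(m[i][k] * cofactor(m, i, j) for i in range(len(m)))


m = [[2, 1, -5, 1],
     [1, -3, 0, -6],
     [0, 2, -1, 2],
     [1, 4, -7, 6]]

# table[k][j] holds the sum for columns k and j
table = [[column_sum(m, k, j) for j in range(4)] for k in range(4)]
for row in table:
    print(row)
```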

Let us now examine systems of linear equations; we will confine
ourselves for the time being to systems in which the number of equations
is equal to the number of unknowns, i.e., systems of the form

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1, \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2, \\ \dots\dots\dots\dots\dots\dots\dots & \\ a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n &= b_n \end{aligned} \qquad (1)$$

We also assume that the determinant d made up of the coefficients
of the unknowns of the system (called, for short, the determinant
of the system) is nonzero. Given these assumptions, we will
prove that the system (1) is consistent and even determinate.

In Sec. 2, when we solved a system of three equations in three
unknowns, we multiplied each of the equations by a factor, and
then added the equations; the coefficients of two of the unknowns
proved to be zero. We now see immediately that the factors which
we used were cofactors, in the determinant of the system, of the
element which was the coefficient of the desired unknown in the
given equation. We now use this device to solve system (1).

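First suppose that system (1) is consistent and $\alpha_1, \alpha_2, \dots, \alpha_n$
is one of its solutions. Hence, the following equations hold true:

$$\begin{aligned} a_{11}\alpha_1 + a_{12}\alpha_2 + \dots + a_{1n}\alpha_n &= b_1, \\ a_{21}\alpha_1 + a_{22}\alpha_2 + \dots + a_{2n}\alpha_n &= b_2, \\ \dots\dots\dots\dots\dots\dots\dots & \\ a_{n1}\alpha_1 + a_{n2}\alpha_2 + \dots + a_{nn}\alpha_n &= b_n \end{aligned} \qquad (2)$$

Let j be any one of the numbers 1, 2, ..., n. Multiply both sides
of the first equation of (2) by $A_{1j}$, that is, by the cofactor of the
element $a_{1j}$ in the determinant d of the system. Multiply both
sides of the second equation by $A_{2j}$, and so on. Finally, multiply
both sides of the last equation by $A_{nj}$. Adding together separately
the left and right sides of all equations, we arrive at the following
equation:

$$(a_{11}A_{1j} + a_{21}A_{2j} + \dots + a_{n1}A_{nj})\,\alpha_1 + \dots + (a_{1j}A_{1j} + a_{2j}A_{2j} + \dots + a_{nj}A_{nj})\,\alpha_j + \dots$$

$$+ (a_{1n}A_{1j} + a_{2n}A_{2j} + \dots + a_{nn}A_{nj})\,\alpha_n = b_1A_{1j} + b_2A_{2j} + \dots + b_nA_{nj}$$

The coefficient of $\alpha_j$ in this equation is d, the coefficients of
all other $\alpha$ will, due to the remark made above, be zero, and the
constant term will be the determinant obtained from the determinant
d after replacing the jth column in it by a column of the constant
terms of system (1). If, as in Sec. 2, we denote this latter determinant
by $d_j$, then our equation takes the form

$$d\,\alpha_j = d_j$$

whence, because $d \ne 0$,

$$\alpha_j = \frac{d_j}{d}$$

This proves that if system (1) is consistent, then it possesses the
unique solution

$$\alpha_1 = \frac{d_1}{d}, \quad \alpha_2 = \frac{d_2}{d}, \quad \dots, \quad \alpha_n = \frac{d_n}{d} \qquad (3)$$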

We will now show that the set (3) of numbers actually satisfies
system (1) of equations, that is, that (1) is consistent. We will
make use of the following commonly employed symbolism.

Any sum of the form $a_1 + a_2 + \dots + a_n$ will be denoted
briefly by $\sum_{i=1}^{n} a_i$. If we consider a sum whose terms $a_{ij}$ are labelled
with two subscripts, and i = 1, 2, ..., n, j = 1, 2, ..., m,
then we can first take the sums of the elements with fixed first
subscript, that is, the sums $\sum_{j=1}^{m} a_{ij}$, where i = 1, 2, ..., n, and
then add all the sums. We then obtain the following notation for
the sum of all elements $a_{ij}$:

$$\sum_{i=1}^{n}\sum_{j=1}^{m} a_{ij}$$

However, we could first add the summands with fixed second
subscript and then combine the resulting sums. Thus

$$\sum_{i=1}^{n}\sum_{j=1}^{m} a_{ij} = \sum_{j=1}^{m}\sum_{i=1}^{n} a_{ij}$$

i.e., in a double sum the order of summation may be reversed.

Now put the values of the unknowns (3) into the ith equation
of system (1). Since the left side of the ith equation may be written
as $\sum_{j=1}^{n} a_{ij}x_j$ and since $d_j = \sum_{k=1}^{n} b_kA_{kj}$, we get

$$\sum_{j=1}^{n} a_{ij}\cdot\frac{d_j}{d} = \frac{1}{d}\sum_{j=1}^{n} a_{ij}\Big(\sum_{k=1}^{n} b_kA_{kj}\Big) = \frac{1}{d}\sum_{k=1}^{n} b_k\Big(\sum_{j=1}^{n} a_{ij}A_{kj}\Big)$$

With regard to these manipulations, note that the number $\frac{1}{d}$ turned
out to be a common factor in all summands and was therefore taken
outside the summation sign; besides, after changing the order of
summation, the factor $b_k$ was factored out of the inner sum since
it is not dependent on the subscript j of the inner summation.

We know that the expression $\sum_{j=1}^{n} a_{ij}A_{kj} = a_{i1}A_{k1} + a_{i2}A_{k2} + \dots + a_{in}A_{kn}$
will be equal to d for k = i and to 0 for all other
k's. Thus, in our outer sum with respect to k there will be only
one summand left, namely, $b_id$, i.e.,

$$\sum_{j=1}^{n} a_{ij}\cdot\frac{d_j}{d} = \frac{1}{d}\,b_id = b_i$$

This is proof that the set (3) of numbers is indeed a solution to the
system (1) of equations.

We have obtained the following important result. 

A system of n linear equations in n unknowns, the determinant
of which is nonzero, has a unique solution. This solution is obtained
from formulas (3), that is, by means of Cramer's rule. The formulation
of this rule is the same as in the case of a system of two equations
(see Sec. 2).


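Example. Solve the system of linear equations

$$\begin{aligned} 2x_1 + x_2 - 5x_3 + x_4 &= 8, \\ x_1 - 3x_2 - 6x_4 &= 9, \\ 2x_2 - x_3 + 2x_4 &= -5, \\ x_1 + 4x_2 - 7x_3 + 6x_4 &= 0 \end{aligned}$$

The determinant of the system is different from zero:

$$d = \begin{vmatrix} 2 & 1 & -5 & 1 \\ 1 & -3 & 0 & -6 \\ 0 & 2 & -1 & 2 \\ 1 & 4 & -7 & 6 \end{vmatrix} = 27$$

and so Cramer's rule is applicable. The values of the unknowns will have as
numerators the determinants

$$d_1 = \begin{vmatrix} 8 & 1 & -5 & 1 \\ 9 & -3 & 0 & -6 \\ -5 & 2 & -1 & 2 \\ 0 & 4 & -7 & 6 \end{vmatrix} = 81, \qquad d_2 = \begin{vmatrix} 2 & 8 & -5 & 1 \\ 1 & 9 & 0 & -6 \\ 0 & -5 & -1 & 2 \\ 1 & 0 & -7 & 6 \end{vmatrix} = -108,$$

$$d_3 = \begin{vmatrix} 2 & 1 & 8 & 1 \\ 1 & -3 & 9 & -6 \\ 0 & 2 & -5 & 2 \\ 1 & 4 & 0 & 6 \end{vmatrix} = -27, \qquad d_4 = \begin{vmatrix} 2 & 1 & -5 & 8 \\ 1 & -3 & 0 & 9 \\ 0 & 2 & -1 & -5 \\ 1 & 4 & -7 & 0 \end{vmatrix} = 27$$

Thus,

$$x_1 = 3, \quad x_2 = -4, \quad x_3 = -1, \quad x_4 = 1$$

will be the unique solution of our system.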
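Formulas (3) can be turned into a few lines of code. What follows is a minimal Python sketch, added for illustration: the names `det` and `cramer` are assumptions, exact rational arithmetic from the standard `fractions` module is used so that no rounding enters, and the determinants are evaluated by plain first-column expansion rather than by any efficient method.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first column."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** i * m[i][0]
               * det([row[1:] for k, row in enumerate(m) if k != i])
               for i in range(len(m)) if m[i][0] != 0)


def cramer(a, b):
    """Solve a square system by formulas (3); requires a nonzero determinant."""
    d = det(a)
    if d == 0:
        raise ValueError("the determinant of the system is zero")
    solution = []
    for j in range(len(a)):
        # d_j: column j of the coefficient array replaced by the constant terms
        a_j = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(a_j), d))
    return solution


a = [[2, 1, -5, 1],
     [1, -3, 0, -6],
     [0, 2, -1, 2],
     [1, 4, -7, 6]]
b = [8, 9, -5, 0]

x = cramer(a, b)
print([str(v) for v in x])
# substitute back into every equation as a consistency check
residuals = [sum(c * v for c, v in zip(row, x)) - rhs for row, rhs in zip(a, b)]
```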



We did not consider the case when the determinant of a system 
of n linear equations in n unknowns (1) is zero. It will be discussed 
in Chapter 2, where it will find its place in the general theory of 
systems involving any number of equations in any number of 
unknowns. 

One more remark is in order with respect to systems of n linear
equations in n unknowns. Given a system of n homogeneous linear
equations in n unknowns (see Sec. 1):

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= 0, \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= 0, \\ \dots\dots\dots\dots\dots\dots\dots & \\ a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n &= 0 \end{aligned} \qquad (4)$$

In this case, all determinants $d_j$, j = 1, 2, ..., n, contain
a column made up of zeros and are therefore equal to zero. Thus,
if the determinant of system (4) is nonzero, that is, if Cramer's rule
is applicable, then the only solution of system (4) will be the trivial
solution

$$x_1 = 0, \quad x_2 = 0, \quad \dots, \quad x_n = 0 \qquad (5)$$



Whence follows the result: 

If a system of n homogeneous linear equations in n unknowns has 
nontrivial solutions, then the determinant of the system is necessarily 
zero. 

In Sec. 12 it will also be shown that, conversely, if the determinant
of such a system is indeed equal to zero, then the system will
have solutions other than the trivial solution (the existence of
the latter is obvious for every system of homogeneous equations).


Example. For what values of k can the system of equations

$$\begin{aligned} kx_1 + x_2 &= 0, \\ x_1 + kx_2 &= 0 \end{aligned}$$

have nontrivial solutions?

The determinant of this system

$$\begin{vmatrix} k & 1 \\ 1 & k \end{vmatrix} = k^2 - 1$$

will be zero only when $k = \pm 1$. It is easy to see that for each one of these
two values of k the given system will indeed have nontrivial solutions.

The significance of Cramer's rule lies mainly in the fact that
for cases when it is applicable it offers an explicit expression of
the solution of the system in terms of the coefficients of the system.
However, Cramer's rule involves very unwieldy computations; in
the case of a system of n linear equations in n unknowns, one has
to compute n + 1 determinants of the nth order. The method of
successive elimination of unknowns given in Sec. 1 is much more
convenient in this respect since the computations involved here
are actually equivalent to those required in the evaluation of a single
determinant of the nth order.

In applications, we often encounter systems of linear equations 
whose coefficients and constant terms are real numbers obtained 
in measurements of physical quantities and as such are known only 
approximately, to within a specified accuracy. The foregoing methods 
are then sometimes rather inconvenient because they lead to results
with poor accuracy. A variety of iterative procedures have taken
their place. These are methods which yield solutions of systems
of equations via successive approximations of the unknowns. The 
interested reader will find such methods described in texts dealing 
with the theory of approximate calculations. 


CHAPTER 2 


SYSTEMS OF LINEAR 
EQUATIONS 
(GENERAL THEORY) 


8. n-Dimensional Vector Space

To construct a general theory of systems of linear equations
we will need more than the apparatus that sufficed with such success 
in the solution of systems to which Cramer’s rule was applicable. 
Besides determinants and matrices we will need a new concept, 
which, perhaps, is of still greater general mathematical interest— 
that of multidimensional vector spaces. 

First a few preliminary remarks. From the course of analytic 
geometry we know that any point in a plane is determined (for 
specified coordinate axes) by its two coordinates, which is to say, 
by an ordered set of two real numbers. Any vector in a plane is 
determined by its two components, which again is an ordered set 
of two real numbers. Similarly, a point in three-dimensional space 
is determined by three coordinates, a vector in space, by three 
components. 

In geometry and also in mechanics and physics we often encoun- 
ter objects whose specification requires more than three real numbers. 
For instance, let us consider a collection of spheres in three-dimen- 
sional space. To specify a sphere completely we need the coordinates 
of its centre and the radius; this amounts to an ordered set of four 
real numbers, of which, incidentally, the radius can only assume 
positive values. On the other hand, let us consider various positions
of a solid in space. The position of a solid will be fully defined if
we indicate the coordinates of its centre of gravity (this requires
three real numbers), the direction of some fixed axis passing through 
the centre of gravity (two numbers— two out of three direction 
cosines), and, finally, the angle of rotation about this axis. Thus, 
the position of a solid body in space is determined by an ordered 
set of six real numbers. 

These examples suggest considering collections of all possible
ordered sets of n real numbers. After introducing the operations
of addition and multiplication by a scalar (this will be done later
on by analogy with appropriate operations involving vectors in
three-dimensional space expressed in terms of components), we call
this collection an n-dimensional vector space. Thus, n-dimensional
space is only an algebraic structure which retains certain of the
simplest properties of collections of vectors of three-dimensional
space emanating from a coordinate origin.

An ordered set of n numbers (an ordered n-tuple)

$$\alpha = (a_1, a_2, \dots, a_n) \qquad (1)$$

is called an n-dimensional vector. The numbers $a_i$, i = 1, 2, ..., n,
will be called the components of the vector $\alpha$. The vectors $\alpha$ and

$$\beta = (b_1, b_2, \dots, b_n) \qquad (2)$$

will be considered equal if their components, in the same places,
coincide, that is, if $a_i = b_i$, i = 1, 2, ..., n. Lower-case Greek
letters will be used to denote vectors and lower-case Latin letters to
denote scalars.

Examples of vectors are: (1) Vector segments (directed line-
segments) emanating from the coordinate origin in a plane or in
three-dimensional space will, given a fixed system of coordinates,
be two- and three-dimensional vectors in the meaning of the definition
given above. (2) The coefficients of a linear equation in n unknowns
constitute an n-dimensional vector. (3) Any solution of a system
of linear equations in n unknowns is an n-dimensional vector.
(4) If an s by n matrix is given (s rows and n columns), then its
rows are n-dimensional vectors, its columns, s-dimensional vectors.
(5) The s by n matrix itself can be regarded as an sn-dimensional
vector: all we need to do is read the elements of the matrix one
after the other, row by row; in particular, any square matrix of
order n may be regarded as an $n^2$-dimensional vector, and it is
quite obvious that any $n^2$-dimensional vector may be obtained
in this way from a matrix of order n.

The sum of vectors (1) and (2) is the vector

$$\alpha + \beta = (a_1 + b_1, a_2 + b_2, \dots, a_n + b_n) \qquad (3)$$

whose components are sums of the corresponding components of the
vectors being added. Addition of vectors is commutative and
associative because of the commutativity and associativity of the addition
of numbers.

The role of zero is played by the zero vector:

$$0 = (0, 0, \dots, 0) \qquad (4)$$

Indeed,

$$\alpha + 0 = (a_1 + 0, a_2 + 0, \dots, a_n + 0) = (a_1, a_2, \dots, a_n) = \alpha$$
We use the same symbol 0 for the zero vector as for the number 0.
There is never any difficulty in deciding whether it is the number
zero or the zero vector we are talking about at any time. However,
from now on the reader should bear in mind the possibility of
different interpretations of the symbol 0.

We use the term opposite vector (negative) of the vector (1) for
the vector

$$-\alpha = (-a_1, -a_2, \dots, -a_n) \qquad (5)$$

It is obvious that $\alpha + (-\alpha) = 0$. It is now easy to see that for
the addition of vectors there is an inverse operation, subtraction:
the difference between the vectors (1) and (2) is the vector
$\alpha - \beta = \alpha + (-\beta)$, or

$$\alpha - \beta = (a_1 - b_1, a_2 - b_2, \dots, a_n - b_n) \qquad (6)$$

The addition of n-dimensional vectors defined by formula (3)
arose out of the geometric addition of vectors in the plane or in
three-dimensional space performed by the parallelogram rule. In
geometry we have to do with the multiplication of a vector by a real
number ("scalar"): the multiplication of a vector $\alpha$ by a scalar k
signifies, for k > 0, a stretching of $\alpha$ by a factor k (it is compression
if k < 1), and for k < 0 a stretching by a factor |k| and reversal
of direction. Expressing this rule in terms of the components of the
vector $\alpha$ and passing to the general case at hand, we obtain the
following definition.

The product of a vector (1) by a scalar k is the vector

$$k\alpha = \alpha k = (ka_1, ka_2, \dots, ka_n) \qquad (7)$$

whose components are equal to the product of the corresponding
components of the vector $\alpha$ by k.

From this definition there follow important properties which
may be verified by the reader:

$$k(\alpha \pm \beta) = k\alpha \pm k\beta, \qquad (8)$$

$$(k \pm l)\alpha = k\alpha \pm l\alpha, \qquad (9)$$

$$k(l\alpha) = (kl)\alpha, \qquad (10)$$

$$1\cdot\alpha = \alpha \qquad (11)$$

The following properties are just as easy to verify but they may
also be obtained as corollaries to Properties (8)-(11):

$$0\cdot\alpha = 0, \qquad (12)$$

$$(-1)\cdot\alpha = -\alpha, \qquad (13)$$

$$k\cdot 0 = 0, \qquad (14)$$

if $k\alpha = 0$, then either k = 0, or $\alpha = 0$. (15)
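The componentwise definitions (3), (5), (6) and (7) are short enough to write out in code, which gives an easy way to spot-check Properties (8)-(14) on sample data. The Python sketch below is an illustrative addition; the function names and the sample vectors are assumptions, and a check on particular numbers is of course not a proof.

```python
def add(a, b):
    """Componentwise sum of two vectors, formula (3)."""
    return tuple(x + y for x, y in zip(a, b))


def scale(k, a):
    """Product of the vector a by the scalar k, formula (7)."""
    return tuple(k * x for x in a)


def neg(a):
    """Opposite vector, formula (5)."""
    return scale(-1, a)


def sub(a, b):
    """Difference a - b = a + (-b), formula (6)."""
    return add(a, neg(b))


a, b = (1, -2, 3), (4, 0, -5)
k, l = 3, -2
zero = (0, 0, 0)

checks = {
    "(8)": scale(k, add(a, b)) == add(scale(k, a), scale(k, b)),
    "(9)": scale(k + l, a) == add(scale(k, a), scale(l, a)),
    "(10)": scale(k, scale(l, a)) == scale(k * l, a),
    "(11)": scale(1, a) == a,
    "(12)": scale(0, a) == zero,
    "(13)": scale(-1, a) == neg(a),
    "(14)": scale(k, zero) == zero,
}
print(checks)
```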
The collection of all n-dimensional vectors with real components
regarded in conjunction with the operations of addition of vectors
and multiplication of a vector by a scalar is called an n-dimensional
vector space.

Note that the definition of an n-dimensional vector space does 
not include multiplication of a vector by a vector. It would be 
easy to define multiplication of vectors— assume, say, that the 
components of a product of vectors are equal to the products of the 
corresponding components of the factors. However, such multiplica- 
tion would not find any serious applications. Thus, vector segments 
emanating from a coordinate origin in the plane or in three-dimen- 
sional space constitute (for a fixed system of coordinates) a two- 
dimensional and, respectively, a three-dimensional vector space. 
The addition of vectors and the multiplication of a vector by a scalar 
are, as we have pointed out above, geometrically important, whereas 
it is impossible to give any reasonable geometrical interpretation 
to the componentwise multiplication of vectors. 

Let us consider another example. The left side of a linear equation
in n unknowns, that is, an expression of the form

$$f = a_1x_1 + a_2x_2 + \dots + a_nx_n$$

is called a linear form in the unknowns $x_1, x_2, \dots, x_n$. The linear
form f is obviously defined completely by the vector $(a_1, a_2, \dots, a_n)$
of its coefficients; conversely, any n-dimensional vector uniquely
determines some linear form. The addition of vectors and the
multiplication of a vector by a scalar become corresponding operations
involving linear forms; these operations were extensively used
in Sec. 1. Componentwise multiplication of vectors in this instance
is meaningless.

9. Linear Dependence of Vectors 

A vector $\beta$ of n-dimensional vector space is proportional to
vector $\alpha$ if there exists a number k such that $\beta = k\alpha$ [see formula (7)
of the preceding section]. In particular, the zero vector is
proportional to any vector $\alpha$ due to the equality $0 = 0\cdot\alpha$. But if $\beta = k\alpha$
and $\beta \ne 0$, whence $k \ne 0$, then $\alpha = k^{-1}\beta$, that is, for nonzero
vectors, proportionality possesses the property of symmetry.

A generalization of the concept of proportionality of vectors
is the following concept which we have already (in the case of rows
in a matrix) encountered in Sec. 4: a vector $\beta$ is called a linear
combination of the vectors $\alpha_1, \alpha_2, \dots, \alpha_s$ if there exist numbers
$l_1, l_2, \dots, l_s$ such that

$$\beta = l_1\alpha_1 + l_2\alpha_2 + \dots + l_s\alpha_s$$





Thus the jth component of the vector $\beta$, j = 1, 2, ..., n, is equal
(because of the definition of a sum of vectors and a product of a vector
by a scalar) to the sum of the products of the jth components of
the vectors $\alpha_1, \alpha_2, \dots, \alpha_s$ by $l_1, l_2, \dots, l_s$, respectively.

A system of vectors

$$\alpha_1, \alpha_2, \dots, \alpha_{r-1}, \alpha_r \quad (r \ge 2) \qquad (1)$$

is linearly dependent if at least one of the vectors is a linear
combination of the remaining vectors of the system; it is called linearly
independent otherwise.

We give another form of this extremely important definition:
a system of vectors (1) is linearly dependent if there exist numbers
$k_1, k_2, \dots, k_r$, at least one of which is nonzero, such that the equation

$$k_1\alpha_1 + k_2\alpha_2 + \dots + k_r\alpha_r = 0 \qquad (2)$$

holds true.

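Proof of the equivalence of these two definitions is not difficult.
For example, let the vector $\alpha_r$ of system (1) be a linear combination
of the remaining vectors:

$$\alpha_r = l_1\alpha_1 + l_2\alpha_2 + \dots + l_{r-1}\alpha_{r-1}$$

From this there follows the equation

$$l_1\alpha_1 + l_2\alpha_2 + \dots + l_{r-1}\alpha_{r-1} - \alpha_r = 0$$

which is like (2), where $k_i = l_i$ for i = 1, 2, ..., r - 1 and
$k_r = -1$, that is, $k_r \ne 0$. Conversely, let the vectors (1) be connected
by the relation (2) in which, say, $k_r \ne 0$. Then

$$\alpha_r = \Big(-\frac{k_1}{k_r}\Big)\alpha_1 + \Big(-\frac{k_2}{k_r}\Big)\alpha_2 + \dots + \Big(-\frac{k_{r-1}}{k_r}\Big)\alpha_{r-1}$$

Vector $\alpha_r$ has proved to be a linear combination of the vectors
$\alpha_1, \alpha_2, \dots, \alpha_{r-1}$.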


Example. The system of vectors

$$\alpha_1 = (5, 2, 1), \quad \alpha_2 = (-1, 3, 3), \quad \alpha_3 = (9, 7, 5), \quad \alpha_4 = (3, 8, 7)$$

is linearly dependent, since the vectors are connected by the relation

$$4\alpha_1 - \alpha_2 - 3\alpha_3 + 2\alpha_4 = 0$$

In this relation all the coefficients are different from zero. However, there are
other linear dependences between the vectors, dependences in which some of
the coefficients are zero, for instance

$$2\alpha_1 + \alpha_2 - \alpha_3 = 0, \qquad 3\alpha_2 + \alpha_3 - 2\alpha_4 = 0$$
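The three relations of this example can be verified component by component. The small Python sketch below is an addition for illustration; `combine` is an assumed helper name that forms a linear combination of the given vectors with the given coefficients.

```python
def combine(coefficients, vectors):
    """Linear combination k_1*v_1 + ... + k_r*v_r, computed componentwise."""
    n = len(vectors[0])
    return tuple(sum(k * v[j] for k, v in zip(coefficients, vectors))
                 for j in range(n))


vectors = [(5, 2, 1), (-1, 3, 3), (9, 7, 5), (3, 8, 7)]

relations = [
    [4, -1, -3, 2],   # 4a1 - a2 - 3a3 + 2a4
    [2, 1, -1, 0],    # 2a1 + a2 - a3
    [0, 3, 1, -2],    # 3a2 + a3 - 2a4
]
for coefficients in relations:
    print(coefficients, combine(coefficients, vectors))
```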

The latter definition of a linear dependence given above is also
applicable to the case of r = 1, that is, to the case of a system
consisting of one vector $\alpha$: this system is linearly dependent if and
only if $\alpha = 0$. Indeed, if $\alpha = 0$, then, say, for k = 1 we will have
$k\alpha = 0$. Conversely, if $k\alpha = 0$ and $k \ne 0$, then $\alpha = 0$.
Note the following property of the concept of linear dependence.

If some subsystem of the system (1) of vectors is linearly dependent,
then the whole system (1) is linearly dependent.

Indeed, let the vectors $\alpha_1, \alpha_2, \dots, \alpha_s$ of system (1), where
s < r, be connected by the relation

$$k_1\alpha_1 + k_2\alpha_2 + \dots + k_s\alpha_s = 0$$

in which not all coefficients are zero. It then follows that the relation

$$k_1\alpha_1 + k_2\alpha_2 + \dots + k_s\alpha_s + 0\cdot\alpha_{s+1} + \dots + 0\cdot\alpha_r = 0$$

holds, that is, system (1) is linearly dependent.

From this property follows the linear dependence of any system
of vectors containing two equal or, generally, two proportional
vectors and also of any system containing the zero vector. The
property we have just proved can also be stated as follows: if a system
(1) of vectors is linearly independent, then any subsystem of (1) is also
linearly independent.

The question arises as to how many vectors a linearly independent
system of n-dimensional vectors can contain and, in particular,
whether there exist systems with an arbitrarily large number of
vectors. To answer this question, let us consider the following
vectors in an n-dimensional vector space:

$$\begin{aligned} \varepsilon_1 &= (1, 0, 0, \dots, 0), \\ \varepsilon_2 &= (0, 1, 0, \dots, 0), \\ &\dots\dots\dots\dots \\ \varepsilon_n &= (0, 0, 0, \dots, 1) \end{aligned} \qquad (3)$$

They are called unit vectors of that space. The system of unit vectors
will be linearly independent: let

$$k_1\varepsilon_1 + k_2\varepsilon_2 + \dots + k_n\varepsilon_n = 0$$

Since the left side of this equation is equal to the vector
$(k_1, k_2, \dots, k_n)$, it follows that

$$(k_1, k_2, \dots, k_n) = 0$$

or $k_i = 0$, i = 1, 2, ..., n, since all the components of the zero
vector are zero, and equality of vectors is equivalent to equality
of their corresponding components.

Thus, in n-dimensional vector space we have found one linearly
independent system consisting of n vectors. The reader will learn
later on that there actually exist an infinite number of distinct
systems of that kind in this space.

On the other hand, let us prove the following theorem.

For s > n, any s vectors of an n-dimensional vector space constitute
a linearly dependent system.

Let there be given the vectors

$$\begin{aligned} \alpha_1 &= (a_{11}, a_{12}, \dots, a_{1n}), \\ \alpha_2 &= (a_{21}, a_{22}, \dots, a_{2n}), \\ &\dots\dots\dots\dots \\ \alpha_s &= (a_{s1}, a_{s2}, \dots, a_{sn}) \end{aligned}$$

We have to choose scalars $k_1, k_2, \dots, k_s$, not all zero, such that

$$k_1\alpha_1 + k_2\alpha_2 + \dots + k_s\alpha_s = 0 \qquad (4)$$

Passing from (4) to the corresponding equations between the
components, we get

$$\begin{aligned} a_{11}k_1 + a_{21}k_2 + \dots + a_{s1}k_s &= 0, \\ a_{12}k_1 + a_{22}k_2 + \dots + a_{s2}k_s &= 0, \\ \dots\dots\dots\dots\dots\dots\dots & \\ a_{1n}k_1 + a_{2n}k_2 + \dots + a_{sn}k_s &= 0 \end{aligned} \qquad (5)$$

However, equations (5) constitute a system of n homogeneous
linear equations in s unknowns $k_1, k_2, \dots, k_s$. The number of
equations in this system is less than the number of unknowns, and
therefore, as proved at the end of Sec. 1, the system has nontrivial
solutions. We can thus choose scalars $k_1, k_2, \dots, k_s$ (not all
zero) which will satisfy requirement (4). The theorem is proved.
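The proof is constructive: solving the homogeneous system (5) by successive elimination yields the coefficients of a dependence. The Python sketch below is an illustrative addition, not the book's own procedure in detail; the name `dependence` is an assumption, exact fractions are used, and one free unknown is set to 1 to pick a particular nontrivial solution.

```python
from fractions import Fraction


def dependence(vectors):
    """Coefficients k_1..k_s, not all zero, with k_1*v_1 + ... + k_s*v_s = 0.

    Returns None when only the trivial solution exists (independent vectors).
    """
    s, n = len(vectors), len(vectors[0])
    # System (5): one equation per component, one unknown per vector.
    rows = [[Fraction(vectors[i][j]) for i in range(s)] for j in range(n)]
    pivots = []   # pivots[t] is the pivot column of reduced row t
    r = 0
    for c in range(s):
        if r == n:
            break
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(s) if c not in pivots]
    if not free:
        return None
    k = [Fraction(0)] * s
    k[free[0]] = Fraction(1)            # one free unknown set to 1, the rest to 0
    for t, c in enumerate(pivots):
        k[c] = -rows[t][free[0]]        # each pivot unknown follows from its row
    return k


vectors = [(5, 2, 1), (-1, 3, 3), (9, 7, 5), (3, 8, 7)]   # s = 4 > n = 3
k = dependence(vectors)
print([str(x) for x in k])
```

When s > n there are at most n pivot columns among s unknowns, so a free unknown always exists and the function cannot return `None`, in agreement with the theorem.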
Let us call a linearly independent system of n-dimensional
vectors

$$\alpha_1, \alpha_2, \dots, \alpha_r \qquad (6)$$

a maximal linearly independent system if by adjoining to this
system any n-dimensional vector $\beta$ we obtain a linearly dependent
system. Since in every linear dependence relating the vectors
$\alpha_1, \alpha_2, \dots, \alpha_r, \beta$ the coefficient of $\beta$ must be nonzero (otherwise
system (6) would be linearly dependent), it follows that the vector
$\beta$ is expressed linearly in terms of the vectors (6). Therefore the
system (6) of vectors is a maximal linearly independent system if
and only if the vectors (6) are linearly independent and any
n-dimensional vector $\beta$ is a linear combination of them.

From the results obtained above it follows that in an n-dimensional
space any linearly independent system consisting of n vectors is maximal
and also that any maximal linearly independent system of vectors of
this space consists of at most n vectors.

Every linearly independent system of n-dimensional vectors is 
contained in at least one maximal linearly independent system. Indeed, 
if a given system of vectors is not maximal, then one vector may 
be added to it so that the resulting system remains linearly inde- 
pendent. If this new system is still not maximal, then another vector 
may be added to it, and so on. However, this process cannot continue
endlessly because every system of n-dimensional vectors consisting
of n + 1 vectors is linearly dependent.
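
The extension process can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `rank` and `extend_to_maximal` are hypothetical, the candidates tried for adjoining are the unit vectors (a convenient choice, since a system that is not maximal cannot express all of them), and the input system is assumed to be linearly independent.

```python
from fractions import Fraction

def rank(vectors):
    """Maximum number of linearly independent vectors (exact elimination)."""
    a = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(a[0]) if a else 0):
        p = next((t for t in range(r, len(a)) if a[t][col] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        for t in range(r + 1, len(a)):
            f = a[t][col] / a[r][col]
            a[t] = [x - f * y for x, y in zip(a[t], a[r])]
        r += 1
    return r

def extend_to_maximal(system, n):
    """Adjoin unit vectors to a linearly independent system until it is maximal."""
    result = [list(v) for v in system]
    for i in range(n):
        unit = [1 if j == i else 0 for j in range(n)]
        # Keep the unit vector only if the enlarged system stays independent.
        if rank(result + [unit]) > len(result):
            result.append(unit)
    return result

maximal = extend_to_maximal([(1, 1, 0)], 3)
```

In this run the unit vectors (1, 0, 0) and (0, 0, 1) are adjoined, while (0, 1, 0) is skipped because it is already a linear combination of the vectors collected so far.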

Since every system consisting of one nonzero vector is linearly 
independent, we find that any nonzero vector is contained in some 
maximal linearly independent system, and for this reason there are 
infinitely many different maximal linearly independent systems of 
vectors in an n-dimensional vector space. 

The question arises: do there exist, in this space, maximal 
linearly independent systems with a smaller number of vectors 
than n or is the number of vectors in any such system invariably 
equal to n? The answer to this important question will be given 
below after a few preliminary investigations. 

If vector $\beta$ is a linear combination of the vectors

$$\alpha_1, \alpha_2, \ldots, \alpha_r \tag{7}$$

it is often said that $\beta$ is expressed linearly in terms of system (7).
Naturally, if vector $\beta$ is linearly expressed in terms of some subsystem
of this system, then it will be linearly expressed in terms of (7)
as well: it is sufficient to take the remaining vectors of the
system with coefficients equal to zero. Generalizing this terminology,
we say that the system of vectors

$$\beta_1, \beta_2, \ldots, \beta_s \tag{8}$$

is expressed linearly in terms of system (7) if every vector $\beta_i$,
$i = 1, 2, \ldots, s$, is a linear combination of the vectors of (7).

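We prove the transitivity of this concept: if system (8) is expressed
linearly in terms of (7), and the system of vectors

$$\gamma_1, \gamma_2, \ldots, \gamma_t \tag{9}$$

is expressed linearly in terms of (8), then (9) is expressed linearly in
terms of (7) as well.

Indeed,

$$\gamma_j = \sum_{i=1}^{s} l_{ji}\beta_i, \quad j = 1, 2, \ldots, t, \tag{10}$$

but $\beta_i = \sum_{m=1}^{r} k_{im}\alpha_m$, $i = 1, 2, \ldots, s$. Substituting these expressions
into (10), we get

$$\gamma_j = \sum_{i=1}^{s} l_{ji}\Big(\sum_{m=1}^{r} k_{im}\alpha_m\Big) = \sum_{m=1}^{r}\Big(\sum_{i=1}^{s} l_{ji}k_{im}\Big)\alpha_m.$$

In other words, every vector $\gamma_j$, $j = 1, 2, \ldots, t$, is a linear
combination of vectors of system (7).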
Two systems of vectors are termed equivalent if each one of them
can be expressed linearly in terms of the other. From the proof 
given here of the transitivity of the property of systems of vectors 
to be expressed linearly in terms of each other there follows the 
transitivity of the concept of equivalence of systems of vectors and 
also the following assertion: if two systems of vectors are equivalent 
and if some vector is expressed linearly in terms of one of these systems, 
then it will be expressed linearly in terms of the other too. 

One cannot assert that if one of two equivalent systems of vectors 
is linearly independent, then the other system also possesses this 
property. But if both systems are linearly independent, then an 
important statement can be made with respect to the number of 
vectors entering into them. First let us prove the following theorem,
which, because of the role it will play later, we shall call
the fundamental theorem.

If in an n-dimensional vector space we have two systems of vectors:

$$\text{(I)} \quad \alpha_1, \alpha_2, \ldots, \alpha_r,$$

$$\text{(II)} \quad \beta_1, \beta_2, \ldots, \beta_s,$$

the first being linearly independent and expressible linearly in terms
of the second, then the number of vectors in the first system does not
exceed that in the second system, that is, $r \leq s$.

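Let $r > s$. By hypothesis, each vector of system (I) can be expressed
linearly in terms of system (II):

$$
\begin{aligned}
\alpha_1 &= a_{11}\beta_1 + a_{12}\beta_2 + \ldots + a_{1s}\beta_s, \\
\alpha_2 &= a_{21}\beta_1 + a_{22}\beta_2 + \ldots + a_{2s}\beta_s, \\
&\ldots \\
\alpha_r &= a_{r1}\beta_1 + a_{r2}\beta_2 + \ldots + a_{rs}\beta_s.
\end{aligned}
\tag{11}
$$

The coefficients of these linear expressions constitute a system of
r s-dimensional vectors:

$$
\begin{aligned}
\gamma_1 &= (a_{11}, a_{12}, \ldots, a_{1s}), \\
&\ldots \\
\gamma_r &= (a_{r1}, a_{r2}, \ldots, a_{rs}).
\end{aligned}
$$

Since $r > s$, these vectors are linearly dependent, that is,

$$k_1\gamma_1 + k_2\gamma_2 + \ldots + k_r\gamma_r = 0,$$

where not all coefficients $k_1, k_2, \ldots, k_r$ are zero. Whence we arrive
at certain equations between the components:

$$\sum_{i=1}^{r} k_i a_{ij} = 0, \quad j = 1, 2, \ldots, s. \tag{12}$$

Let us now consider the following linear combination of vectors
of system (I):

$$k_1\alpha_1 + k_2\alpha_2 + \ldots + k_r\alpha_r,$$

or, more compactly, $\sum_{i=1}^{r} k_i\alpha_i$. Utilizing (11) and (12), we get

$$\sum_{i=1}^{r} k_i\alpha_i = \sum_{i=1}^{r} k_i\Big(\sum_{j=1}^{s} a_{ij}\beta_j\Big) = \sum_{j=1}^{s}\Big(\sum_{i=1}^{r} k_i a_{ij}\Big)\beta_j = 0.$$

But this runs counter to the linear independence of system (I).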

From the fundamental theorem just proved we have the following 
result. 

Any two equivalent linearly independent systems of vectors contain 
an equal number of vectors. 

Any two maximal linearly independent systems of n-dimensional
vectors are evidently equivalent. They therefore consist of one and
the same number of vectors, and since (as we know) there exist
systems of that kind consisting of n vectors, we finally get the answer
to the earlier posed question: every maximal linearly independent
system of vectors of an n-dimensional vector space consists of n vectors.

Some corollaries follow. 

If in a given linearly dependent system of vectors we take two
maximal linearly independent subsystems, that is, subsystems to
which no vector of our system can be adjoined without spoiling the linear
independence, then these subsystems contain an equal number of vectors.

Indeed, if in the system of vectors

$$\alpha_1, \alpha_2, \ldots, \alpha_r \tag{13}$$

the subsystem

$$\alpha_1, \alpha_2, \ldots, \alpha_s, \quad s < r, \tag{14}$$

is a maximal linearly independent subsystem, then any one of the
vectors $\alpha_{s+1}, \ldots, \alpha_r$ is expressible linearly in terms of system (14).
On the other hand, any vector $\alpha_i$ of system (13) is linearly expressible
in terms of this system: it is only necessary to take the coefficient
1 for the vector $\alpha_i$, and the coefficient 0 for all the other vectors.
It is now easy to see that systems (13) and (14) are equivalent. From
this it follows that (13) is equivalent to any one of its maximal
linearly independent subsystems, and therefore all these subsystems are
equivalent; i.e., being linearly independent, they contain the same
number of vectors each.

The number of vectors in any maximal linearly independent
subsystem of a given system of vectors is termed the rank of the
system. Taking advantage of this concept, we derive yet another
corollary from the fundamental theorem.
Suppose there are two systems of n-dimensional vectors:

$$\alpha_1, \alpha_2, \ldots, \alpha_r \tag{15}$$

and

$$\beta_1, \beta_2, \ldots, \beta_s \tag{16}$$

which are not necessarily linearly independent; the rank of system (15)
is equal to the number k, the rank of system (16), to the number l. If
the first system is expressed linearly in terms of the second, then $k \leq l$.
But if these systems are equivalent, then $k = l$.

In fact, let

$$\alpha_{i_1}, \alpha_{i_2}, \ldots, \alpha_{i_k} \tag{17}$$

and

$$\beta_{j_1}, \beta_{j_2}, \ldots, \beta_{j_l} \tag{18}$$

be, respectively, any maximal linearly independent subsystems
of (15) and (16). Then systems (15) and (17) are equivalent and
the same holds true for (16) and (18). From the fact that (15) is
linearly expressible in terms of (16) it now follows that (17) is
also linearly expressible in terms of (16) and therefore in terms of the
equivalent system (18). It then remains, utilizing the linear
independence of system (17), to apply the fundamental theorem. The second
assertion of the corollary being proved follows directly from the
first.

10. Rank of a Matrix 

If we are given a system of n-dimensional vectors, it is natural
to ask whether this system of vectors is linearly dependent or not.
One cannot hope to find that in every specific instance the question
will be resolved without difficulty: a superficial examination of the
system of vectors

$$\alpha = (2, -5, 1, -1), \quad \beta = (1, 3, 6, 5), \quad \gamma = (-1, 4, 1, 2)$$

fails to reveal any linear dependences in it, though in reality these
vectors are connected by the relation

$$7\alpha - 3\beta + 11\gamma = 0.$$

One way of settling this issue is given in Sec. 1. Since the
components of the given vectors are known, we consider as unknown
the coefficients of the desired linear dependence and obtain a system
of homogeneous linear equations, which we solve by the Gaussian
method. In this section we suggest a different approach, which will
also bring us closer to our principal objective: the solution of
arbitrary systems of linear equations.
Suppose we have an s by n matrix (s rows and n columns)

$$
A = \begin{pmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{s1} & a_{s2} & \ldots & a_{sn}
\end{pmatrix},
$$

the numbers s and n not being related in any way. Regarded as
s-dimensional vectors, the columns of this matrix may, generally
speaking, be linearly dependent. The rank of the system of columns,
that is, the maximal number of linearly independent columns of
matrix A (more precisely, the number of columns in any maximal
linearly independent subsystem of the system of columns), is called
the rank of the matrix.

Naturally, in the same way the rows of matrix A may be regarded
as n-dimensional vectors. It turns out that the rank of the system
of rows of the matrix is equal to the rank of the system of its columns,
that is, it is equal to the rank of the matrix. The proof of this quite
unexpected assertion will be obtained after we point out
yet another way of defining the rank of a matrix (which at the same
time indicates a practical method of evaluation).

Let us first generalize the concept of a minor to the case of
rectangular matrices. In matrix A we choose arbitrary k rows and k columns,
$k \leq \min(s, n)$. The elements at the intersection of these rows and
columns constitute a square matrix of order k, the determinant of
which is called a kth-order minor of matrix A. We will now be
interested in the orders of those minors of A which differ from zero,
namely, the highest one of these orders. In searching for it, it is well
to bear in mind the following: if all kth-order minors of matrix A
are zero, then so also are all minors of higher order. Indeed, expanding
any minor of order $k + j$, $k < k + j \leq \min(s, n)$, by the Laplace
theorem in terms of any k rows, we represent this minor as a sum
of minors of order k multiplied by certain minors of order j, thus
proving that it is zero.
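
To make the definition concrete, the following Python sketch enumerates the kth-order minors of a rectangular matrix and finds the highest order of a nonzero minor. It is an illustration only: the names `det`, `minors` and `highest_nonzero_minor_order` are hypothetical, the determinants are evaluated by exact elimination rather than by the Laplace expansion, and the sample matrix is an assumption chosen for the demonstration.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant of a square matrix by exact Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            sign = -sign  # a row interchange changes the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return sign * result

def minors(matrix, k):
    """Yield every kth-order minor of a rectangular matrix."""
    s, n = len(matrix), len(matrix[0])
    for rows in combinations(range(s), k):
        for cols in combinations(range(n), k):
            yield det([[matrix[i][j] for j in cols] for i in rows])

def highest_nonzero_minor_order(matrix):
    """Largest k for which some kth-order minor is nonzero (0 for a zero matrix)."""
    best = 0
    for k in range(1, min(len(matrix), len(matrix[0])) + 1):
        if any(d != 0 for d in minors(matrix, k)):
            best = k
        else:
            break  # all kth-order minors vanish, so all higher ones do too
    return best

sample = [[1, 2, 3],
          [2, 4, 6],
          [1, 0, 1]]
order = highest_nonzero_minor_order(sample)
```

For this sample the only third-order minor is zero while some second-order minors are not, so `order` is 2. The early `break` relies on the remark just proved.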

Let us now prove the following theorem on the rank of a matrix.

The highest order of nonzero minors of matrix A is equal to the
rank of the matrix.

Proof. Let the highest order of nonzero minors of matrix A
be r. Let us assume (there is no loss of generality) that the rth-order
minor D in the upper left corner of the matrix

$$
A = \begin{pmatrix}
a_{11} & \ldots & a_{1r} & a_{1,r+1} & \ldots & a_{1n} \\
\vdots & & \vdots & \vdots & & \vdots \\
a_{r1} & \ldots & a_{rr} & a_{r,r+1} & \ldots & a_{rn} \\
a_{r+1,1} & \ldots & a_{r+1,r} & a_{r+1,r+1} & \ldots & a_{r+1,n} \\
\vdots & & \vdots & \vdots & & \vdots \\
a_{s1} & \ldots & a_{sr} & a_{s,r+1} & \ldots & a_{sn}
\end{pmatrix}
$$

is different from zero, $D \neq 0$. Then the first r columns of A will
be linearly independent: if there were a linear dependence among them,
then, since corresponding components are combined in the addition of
vectors, this same linear dependence would exist among the columns of
minor D and therefore D would be zero.

Now let us prove that each lth column of A, $r < l \leq n$, is a linear
combination of the first r columns. We take any i, $1 \leq i \leq s$, and
construct an auxiliary determinant of order (r + 1):

$$
\Delta_i = \begin{vmatrix}
a_{11} & \ldots & a_{1r} & a_{1l} \\
\vdots & & \vdots & \vdots \\
a_{r1} & \ldots & a_{rr} & a_{rl} \\
a_{i1} & \ldots & a_{ir} & a_{il}
\end{vmatrix}
$$

obtained by "bordering" the minor D by appropriate elements of the
lth column and the ith row. Determinant $\Delta_i$ is zero for any i. Indeed,
if $i > r$, then $\Delta_i$ is a minor of order (r + 1) of our matrix A and
therefore is zero due to the choice of the number r. But if $i \leq r$,
then $\Delta_i$ can no longer be a minor of matrix A since it cannot be
obtained by deleting from this matrix certain of its rows and columns;
however, determinant $\Delta_i$ now has two equal rows and, hence, is
again zero.

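Let us examine the cofactors of the elements of the last row
of determinant $\Delta_i$. Obviously, the cofactor of the element $a_{il}$ is
minor D. But if $1 \leq j \leq r$, then for the cofactor of element $a_{ij}$
in $\Delta_i$ we have the number

$$
A_j = (-1)^{(r+1)+j} \begin{vmatrix}
a_{11} & \ldots & a_{1,j-1} & a_{1,j+1} & \ldots & a_{1r} & a_{1l} \\
\vdots & & \vdots & \vdots & & \vdots & \vdots \\
a_{r1} & \ldots & a_{r,j-1} & a_{r,j+1} & \ldots & a_{rr} & a_{rl}
\end{vmatrix}.
$$

It is not dependent on i and therefore is denoted by $A_j$. Thus,
expanding determinant $\Delta_i$ about its last row and equating this expansion
to zero, since $\Delta_i = 0$, we get

$$a_{i1}A_1 + a_{i2}A_2 + \ldots + a_{ir}A_r + a_{il}D = 0,$$

whence, because $D \neq 0$,

$$a_{il} = -\frac{A_1}{D}a_{i1} - \frac{A_2}{D}a_{i2} - \ldots - \frac{A_r}{D}a_{ir}.$$

This equation holds true for all i, $i = 1, 2, \ldots, s$, and since
its coefficients are not dependent on i, we find that the entire lth
column of A is a sum of the first r columns taken, respectively, with
the coefficients $-\frac{A_1}{D}, -\frac{A_2}{D}, \ldots, -\frac{A_r}{D}$.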
In the system of columns of matrix A we have thus found a maxi-
mal linearly independent subsystem consisting of r columns. This 
is proof that the rank of matrix A is equal to r, and it completes the 
proof of the rank theorem. 

This theorem provides a practical method for computing the
rank of a matrix and therefore for settling the question of the
existence of linear dependence in a given system of vectors: forming
a matrix for which the given vectors serve as columns and computing
the rank of the matrix, we find the maximum number of linearly
independent vectors of our system.

The method of finding the rank of a matrix based on the rank
theorem requires computing a finite but perhaps very large number
of minors of the matrix. The following remark suggests a way of
substantially simplifying this procedure. If the reader will again
look through the proof of the rank theorem, he will notice that in
the proof we did not take advantage of the fact that all minors of
order (r + 1) of matrix A are equal to zero; actually, we used only
those minors of order (r + 1) which border the given nonzero rth-order
minor D (that is, those which contain it completely within
themselves). For this reason, from the fact that only these minors
are equal to zero it follows that r is the maximum number of linearly
independent columns of matrix A; this implies that all minors of
order (r + 1) of this matrix are zero. We arrive at the following
rule for evaluating the rank of a matrix.

In computing the rank of a matrix, move from minors of smaller
order to minors of greater order. If a nonzero kth-order minor D has
already been found, then only the (k + 1)th-order minors bordering
minor D need be computed; if they are all zero, the rank of the matrix
is k.
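
The rule can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the names `det` and `rank_by_bordering` are hypothetical, determinants are evaluated by exact elimination, and the sample matrix is an assumption chosen for the demonstration.

```python
from fractions import Fraction

def det(m):
    """Determinant of a square matrix by exact Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return sign * result

def rank_by_bordering(matrix):
    """Return (rank, rows, cols), where rows and cols locate a nonzero minor."""
    s, n = len(matrix), len(matrix[0])
    start = next(((i, j) for i in range(s) for j in range(n)
                  if matrix[i][j] != 0), None)
    if start is None:
        return 0, [], []          # zero matrix
    rows, cols = [start[0]], [start[1]]
    while True:
        extension = None
        # Examine only the minors that border the current nonzero minor.
        for i in (i for i in range(s) if i not in rows):
            for j in (j for j in range(n) if j not in cols):
                rr, cc = sorted(rows + [i]), sorted(cols + [j])
                if det([[matrix[a][b] for b in cc] for a in rr]) != 0:
                    extension = (rr, cc)
                    break
            if extension:
                break
        if extension is None:
            return len(rows), rows, cols   # every bordering minor is zero
        rows, cols = extension

sample = [[2, 1, -2, 3],
          [-2, 9, -4, 7],
          [-4, 3, 1, -1]]
r, rows, cols = rank_by_bordering(sample)
```

For this sample the second-order minor in the upper left corner is nonzero and both third-order minors bordering it vanish, so the sketch reports rank 2 together with the rows and columns of that minor.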


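Example 1. Find the rank of the matrix

$$
A = \begin{pmatrix}
2 & -4 & 3 & 1 & 0 \\
1 & -2 & 1 & -4 & 2 \\
0 & 1 & -1 & 3 & 1 \\
4 & -7 & 4 & -4 & 5
\end{pmatrix}.
$$

The second-order minor in the upper left corner of this matrix is zero.
However, the matrix also contains nonzero minors of order two, for instance,

$$d = \begin{vmatrix} -4 & 3 \\ -2 & 1 \end{vmatrix} \neq 0.$$

The third-order minor

$$
d' = \begin{vmatrix}
2 & -4 & 3 \\
1 & -2 & 1 \\
0 & 1 & -1
\end{vmatrix}
$$

bordering minor d is different from zero, d' = 1, but both fourth-order minors
bordering minor d' are zero:

$$
\begin{vmatrix}
2 & -4 & 3 & 1 \\
1 & -2 & 1 & -4 \\
0 & 1 & -1 & 3 \\
4 & -7 & 4 & -4
\end{vmatrix} = 0, \quad
\begin{vmatrix}
2 & -4 & 3 & 0 \\
1 & -2 & 1 & 2 \\
0 & 1 & -1 & 1 \\
4 & -7 & 4 & 5
\end{vmatrix} = 0.
$$

Thus, the rank of matrix A is three.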

Example 2. Find a maximal linearly independent subsystem in the system
of vectors

$$\alpha_1 = (2, -2, -4), \quad \alpha_2 = (1, 9, 3), \quad \alpha_3 = (-2, -4, 1), \quad \alpha_4 = (3, 7, -1).$$

Form the matrix

$$
\begin{pmatrix}
2 & 1 & -2 & 3 \\
-2 & 9 & -4 & 7 \\
-4 & 3 & 1 & -1
\end{pmatrix}
$$

in which the given vectors are columns. The rank of this matrix is two: the
second-order minor in the upper left corner is nonzero, but both third-order
minors bordering it are zero. From this it follows that the vectors $\alpha_1$, $\alpha_2$ form
in the given system one of the maximal linearly independent subsystems.

As a corollary to the rank theorem, we now prove an assertion 
that was stated earlier. 

The maximum number of linearly independent rows of any matrix 
is equal to the maximum number of its linearly independent columns, 
which means that it is equal to the rank of the matrix. 

To prove this, take the transpose of the matrix (that is, inter- 
change rows and columns retaining the subscripts of the elements). 
In taking the transpose, the maximal order of nonzero minors of 
the matrix cannot change since taking transposes does not change 
the determinant, and for any minor of the original matrix the minor 
obtained from it by taking the transpose is in the new matrix, and 
conversely. Whence it follows that the rank of the new matrix is 
equal to the rank of the original matrix; it is also equal to the maxi- 
mum number of linearly independent columns of the new matrix 
(or the maximum number of linearly independent rows of the ori- 
ginal matrix). 
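
The corollary is easy to check numerically on a particular matrix. The sketch below does so in Python; the helper names `rank` and `transpose` are hypothetical, the rank is computed by exact elimination rather than through minors, and the sample matrix is an assumption chosen for the demonstration.

```python
from fractions import Fraction

def rank(rows):
    """Maximum number of linearly independent rows (exact elimination)."""
    a = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(a[0]) if a else 0):
        p = next((t for t in range(r, len(a)) if a[t][col] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        for t in range(r + 1, len(a)):
            f = a[t][col] / a[r][col]
            a[t] = [x - f * y for x, y in zip(a[t], a[r])]
        r += 1
    return r

def transpose(matrix):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*matrix)]

sample = [[2, 1, -2, 3],
          [-2, 9, -4, 7],
          [-4, 3, 1, -1]]
row_rank = rank(sample)                # rows as 4-dimensional vectors
column_rank = rank(transpose(sample))  # columns as 3-dimensional vectors
```

Both values come out equal to 2 for this sample, in agreement with the corollary.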

Example. In Sec. 8 we introduced the concept of a linear form in n
unknowns and defined addition of linear forms and their multiplication by a
scalar. This definition permits extending to linear forms the concept of linear
dependence with all its properties.

Let there be a system of linear forms

$$
\begin{aligned}
f_1 &= x_1 + 2x_2 + x_3 + 3x_4, \\
f_2 &= 4x_1 - x_2 - 5x_3 - 6x_4, \\
f_3 &= x_1 - 3x_2 - 4x_3 - 7x_4, \\
f_4 &= 2x_1 + x_2 - x_3.
\end{aligned}
$$

In it we have to choose a maximal linearly independent subsystem.
Form the matrix of the coefficients of these forms:

$$
\begin{pmatrix}
1 & 2 & 1 & 3 \\
4 & -1 & -5 & -6 \\
1 & -3 & -4 & -7 \\
2 & 1 & -1 & 0
\end{pmatrix}
$$

and find its rank. The second-order minor in the upper left corner is nonzero,
but, as can easily be verified, all four third-order minors bordering it are zero.
Whence it follows that the first two rows of our matrix are linearly independent,
and the third and fourth are linear combinations of them. Hence, the system
$f_1$, $f_2$ is the desired subsystem of the given system of linear forms.


There is yet another important consequence of the rank theorem.

An nth-order determinant is equal to zero if and only if there is
a linear dependence among its rows.

This assertion has already been proved in one direction in Sec. 4
(Property 8). Now let there be given an nth-order determinant equal
to zero; in other words, suppose we have a square matrix of order
n whose only minor having maximal order is zero. It then follows that
the highest order of the nonzero minors of this matrix is less than n,
that is, the rank is less than n, and so, on the basis of the foregoing
proof, the rows of this matrix are linearly dependent.

Quite naturally, this corollary can be stated with columns taken
instead of rows.


We now indicate another way to compute the rank of a matrix, one which
is not connected with the rank theorem and does not require evaluating
determinants. Incidentally, it is applicable only when we wish
to know the rank itself and are not interested in precisely which
columns (or rows) comprise the maximal linearly independent system.
The procedure is this.

We use the term elementary transformations of a matrix A for the
following transformations:

(a) interchange (transposition) of two rows or two columns;

(b) multiplication of a row (or a column) by an arbitrary nonzero
scalar;

(c) addition of a multiple of one row (or column) to another row
(column).

Clearly, elementary transformations do not change the rank of a
matrix. Indeed, if these transformations are applied, say, to the
columns of a matrix, the system of columns (regarded as vectors) is
replaced by an equivalent system. We prove it for transformation (c)
since for (a) and (b) it is obvious. Let the jth column multiplied by
a number k be added to the ith column. If, prior to the manipulation,
the vectors

$$\alpha_1, \ldots, \alpha_i, \ldots, \alpha_j, \ldots, \alpha_n \tag{1}$$

served as columns of the matrix, then after the manipulation the
vectors

$$\alpha_1, \ldots, \alpha_i' = \alpha_i + k\alpha_j, \ldots, \alpha_j, \ldots, \alpha_n \tag{2}$$

will form the columns of the matrix. System (2) is expressible
linearly in terms of system (1), and the equation

$$\alpha_i = \alpha_i' - k\alpha_j$$

shows that (1), in turn, is linearly expressible in terms of (2).
Consequently, these systems are equivalent and for this reason their
maximal linearly independent subsystems consist of the same number of
vectors.

Thus, when computing the rank of a matrix, the matrix may 
first be simplified by means of a combination of elementary trans- 
formations. 

We say that an s by n matrix has diagonal form if all its elements
are zero except the elements $a_{11}, a_{22}, \ldots, a_{rr}$ (where
$0 \leq r \leq \min(s, n)$), which are equal to unity. The rank of this matrix
is obviously r.

Using elementary transformations, it is possible to reduce any 
matrix to diagonal form. 

Indeed, suppose we have a matrix

$$
A = \begin{pmatrix}
a_{11} & \ldots & a_{1n} \\
\vdots & & \vdots \\
a_{s1} & \ldots & a_{sn}
\end{pmatrix}.
$$

If all the elements are zero, then it already has diagonal form. But
if there are nonzero elements, then an interchange of rows and
columns will change element $a_{11}$ to a nonzero element. Then by
multiplying the first row by $a_{11}^{-1}$, we convert element $a_{11}$ to unity.
And if we now subtract from the jth column, $j > 1$, the first column
multiplied by $a_{1j}$, then element $a_{1j}$ will be replaced by a zero.
Manipulating in similar fashion all columns beyond the first, and
also all rows, we arrive at a matrix of the form

$$
A' = \begin{pmatrix}
1 & 0 & \ldots & 0 \\
0 & a_{22}' & \ldots & a_{2n}' \\
\vdots & \vdots & & \vdots \\
0 & a_{s2}' & \ldots & a_{sn}'
\end{pmatrix}.
$$

Performing the same manipulations with the submatrix that remains
in the lower right corner, and so on, we finally (after a finite number
of manipulations) arrive at a diagonal matrix with the same rank
as the original matrix A.
Thus, to find the rank of a matrix it is necessary to convert the
matrix, by means of elementary transformations, to diagonal form and
count the number of units in the principal diagonal.
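
The reduction just described can be sketched in Python. This is an illustration under stated assumptions: the function name `to_diagonal_form` is hypothetical, exact `Fraction` arithmetic is used so that no rounding interferes, and the sample matrix is chosen only for the demonstration. Each step is one of the elementary transformations (a), (b), (c).

```python
from fractions import Fraction

def to_diagonal_form(matrix):
    """Reduce a matrix to diagonal form; return (diagonal matrix, rank)."""
    a = [[Fraction(x) for x in row] for row in matrix]
    s, n = len(a), len(a[0])
    r = 0
    while r < min(s, n):
        # Look for a nonzero element in the remaining lower right submatrix.
        pos = next(((i, j) for i in range(r, s) for j in range(r, n)
                    if a[i][j] != 0), None)
        if pos is None:
            break
        i, j = pos
        a[r], a[i] = a[i], a[r]                  # (a) interchange two rows
        for row in a:                            # (a) interchange two columns
            row[r], row[j] = row[j], row[r]
        lead = a[r][r]
        a[r] = [x / lead for x in a[r]]          # (b) make the corner element 1
        for c in range(r + 1, n):                # (c) on columns: clear row r
            f = a[r][c]
            if f != 0:
                for row in a:
                    row[c] -= f * row[r]
        for t in range(r + 1, s):                # (c) on rows: clear column r
            f = a[t][r]
            if f != 0:
                a[t] = [x - f * y for x, y in zip(a[t], a[r])]
        r += 1
    return a, r

sample = [[1, 2, 3],
          [2, 4, 6],
          [1, 0, 1]]
diagonal, units = to_diagonal_form(sample)
```

For this sample the result has two units on the principal diagonal and zeros elsewhere, so the rank is 2.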

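Example. Find the rank of the matrix

$$
A = \begin{pmatrix}
0 & 2 & -4 \\
-1 & -4 & 5 \\
3 & 1 & 7 \\
0 & 5 & -10 \\
2 & 3 & 0
\end{pmatrix}.
$$

Interchanging the first and second columns and multiplying the first row
by the number $\frac{1}{2}$, we get the matrix

$$
\begin{pmatrix}
1 & 0 & -2 \\
-4 & -1 & 5 \\
1 & 3 & 7 \\
5 & 0 & -10 \\
3 & 2 & 0
\end{pmatrix}.
$$

Adding two times the first column to the third column and then adding some
multiple of the new first row to each of the remaining rows, we get the matrix

$$
\begin{pmatrix}
1 & 0 & 0 \\
0 & -1 & -3 \\
0 & 3 & 9 \\
0 & 0 & 0 \\
0 & 2 & 6
\end{pmatrix}.
$$

Finally, multiplying the second row by -1, subtracting from the third column
three times the second column, and then subtracting from the third and fifth
rows certain multiples of the new second row, we arrive at the desired diagonal
form

$$
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}.
$$

The rank of the matrix A is thus two.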

In Chapter 13 we will again encounter elementary transformations and
diagonal matrices; true, these will be matrices in which the elements are
polynomials, not numbers.

11. Systems of Linear Equations

We now begin the study of arbitrary systems of linear equations
without any assumptions concerning the number of equations of
a system being equal to the number of unknowns. Incidentally, the
results we achieve will be applicable to the case (not considered
in Sec. 7) when the number of equations is equal to the number of
unknowns, but the determinant of the system is zero.
Suppose we have a system of linear equations

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1, \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n &= b_2, \\
&\ldots \\
a_{s1}x_1 + a_{s2}x_2 + \ldots + a_{sn}x_n &= b_s.
\end{aligned}
\tag{1}
$$

As we know from Sec. 1, the first thing is to decide whether the
system is consistent or not. For this purpose, take the coefficient
matrix A of the system and the augmented matrix $\bar{A}$ obtained by
adjoining to A a column made up of the constant terms,

$$
A = \begin{pmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{s1} & a_{s2} & \ldots & a_{sn}
\end{pmatrix}, \quad
\bar{A} = \begin{pmatrix}
a_{11} & a_{12} & \ldots & a_{1n} & b_1 \\
a_{21} & a_{22} & \ldots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{s1} & a_{s2} & \ldots & a_{sn} & b_s
\end{pmatrix},
$$

and evaluate the ranks of these matrices. It is easy to see that the
rank of matrix $\bar{A}$ is either equal to the rank of matrix A or exceeds the
latter by unity. Indeed, take a certain maximal linearly independent
system of columns of matrix A. It will also be linearly independent
in matrix $\bar{A}$. If it also retains the property of maximality, that is,
the column of the constant terms is expressible linearly in terms of it,
then the ranks of matrices A and $\bar{A}$ are equal; otherwise, adjoining
to this system a column made up of constant terms yields a linearly
independent system of columns of matrix $\bar{A}$, which is maximal in it.

The question of consistency of a system of linear equations is fully 
resolved by the following theorem. 

Kronecker-Capelli theorem. A system of linear equations (1) is
consistent if and only if the rank of the augmented matrix $\bar{A}$ is equal
to the rank of the matrix A.

Proof. 1. Let system (1) be consistent and let $k_1, k_2, \ldots, k_n$
be one of its solutions. Substituting these numbers, in place of the
unknowns, into (1), we get s identities, which show that the last
column of $\bar{A}$ is the sum of all the remaining columns taken,
respectively, with the coefficients $k_1, k_2, \ldots, k_n$. Any other column of $\bar{A}$ is
also in A and therefore is expressible linearly in terms of all the
columns of this matrix. Conversely, any column of matrix A is
a column of $\bar{A}$ as well, that is, it is linearly expressible in terms of
the columns of this matrix. From this it follows that the systems
of columns of matrices A and $\bar{A}$ are equivalent and therefore, as
proved at the end of Sec. 9, both these systems of s-dimensional
vectors have one and the same rank; in other words, the ranks of
the matrices A and $\bar{A}$ are equal.

2. Now suppose that the matrices A and $\bar{A}$ have equal ranks.
It then follows that any maximal linearly independent system of
columns of A remains a maximal linearly independent system in
matrix $\bar{A}$ as well. For this reason, the last column of $\bar{A}$ can be
expressed linearly in terms of this system and therefore, generally, in terms
of the system of columns of matrix A. Consequently, there exists
a system of coefficients $k_1, k_2, \ldots, k_n$ such that the sum of the
columns of A taken with these coefficients is equal to the column of
constant terms, and therefore the numbers $k_1, k_2, \ldots, k_n$ constitute
a solution of system (1). Thus, coincidence of the ranks of matrices A
and $\bar{A}$ implies that system (1) is consistent.

The proof is complete. In practical situations, it is first necessary
to compute the rank of matrix A; to do this, find one of the nonzero
minors of the matrix such that all the minors bordering it are zero.
Let it be the minor M. Then compute all the minors of matrix $\bar{A}$
bordering M but not contained in A (the so-called characteristic
determinants of system (1)). If they are all zero, then the rank of
matrix $\bar{A}$ is equal to the rank of matrix A and therefore system (1)
is consistent, otherwise it is not consistent. Thus, the Kronecker-
Capelli theorem may be stated as follows: a system of linear equations
(1) is consistent if and only if all its characteristic determinants are
equal to zero.
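
The consistency test of the theorem can be sketched in Python by comparing the two ranks directly. This is an illustration only: the names `rank` and `is_consistent` are hypothetical, the ranks are computed by exact elimination instead of through characteristic determinants, and the sample system is an assumption chosen for the demonstration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows (exact elimination)."""
    a = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(a[0]) if a else 0):
        p = next((t for t in range(r, len(a)) if a[t][col] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        for t in range(r + 1, len(a)):
            f = a[t][col] / a[r][col]
            a[t] = [x - f * y for x, y in zip(a[t], a[r])]
        r += 1
    return r

def is_consistent(A, b):
    """Kronecker-Capelli test: compare the ranks of A and the augmented matrix."""
    augmented = [list(row) + [c] for row, c in zip(A, b)]
    return rank(A) == rank(augmented)

A = [[5, -1, 2, 1],
     [2, 1, 4, -2],
     [1, -3, -6, 5]]
inconsistent_case = is_consistent(A, [7, 1, 0])
consistent_case = is_consistent(A, [7, 1, 5])
```

In the sample the third row of A equals the first row minus twice the second, so the system is consistent exactly when the constant terms satisfy the same relation: the right-hand side (7, 1, 5) passes the test and (7, 1, 0) does not.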

Let us now suppose that system (1) is consistent. The Kronecker-
Capelli theorem which we used to establish the consistency of this
system states that a solution exists. However, it does not give us
any practical method for finding all the solutions of the system. We
shall now investigate this problem.

Let matrix A have rank r. As was proved in the preceding section,
r is equal to the maximum number of linearly independent rows
of matrix A. To be specific, let the first r rows of A be linearly
independent, and let each of the remaining rows be a linear combination
of them. Then the first r rows of $\bar{A}$ will also be linearly independent:
any linear dependence between them would also be a linear
dependence among the first r rows of A (recall the definition of addition
of vectors!). From coincidence of the ranks of matrices A and $\bar{A}$ it
follows that the first r rows of $\bar{A}$ constitute, in it, a maximal linearly
independent system of rows; in other words, any other row of this
matrix is a linear combination of them.

It follows, then, that any equation of system (1) can be represented
as a sum of the first r equations taken with certain coefficients
and therefore any common solution of the first r equations will satisfy
all the equations of (1). Consequently, it suffices to find all the
solutions of the system

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1, \\
&\ldots \\
a_{r1}x_1 + a_{r2}x_2 + \ldots + a_{rn}x_n &= b_r.
\end{aligned}
\tag{2}
$$

Since the rows of coefficients of the unknowns in equations (2)
are linearly independent, that is, the matrix of the coefficients has
rank r, it follows that $r \leq n$ and, besides, that at least one of the
minors of order r of this matrix is nonzero. If r = n, then (2) is
a system with an equal number of equations and unknowns and
with a nonzero determinant; that is, it, and for this reason system (1)
as well, has a unique solution, namely, that which is calculable
by the Cramer rule.

Now let $r < n$ and, for definiteness, let the rth-order minor
made up of the coefficients of the first r unknowns be different from
zero. In each of the equations of (2), transpose to the right side all
terms with the unknowns $x_{r+1}, \ldots, x_n$, and for these unknowns
select certain values $c_{r+1}, \ldots, c_n$. We obtain a system of r equations:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1r}x_r &= b_1 - a_{1,r+1}c_{r+1} - \ldots - a_{1n}c_n, \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2r}x_r &= b_2 - a_{2,r+1}c_{r+1} - \ldots - a_{2n}c_n, \\
&\ldots \\
a_{r1}x_1 + a_{r2}x_2 + \ldots + a_{rr}x_r &= b_r - a_{r,r+1}c_{r+1} - \ldots - a_{rn}c_n
\end{aligned}
\tag{3}
$$

in the r unknowns $x_1, x_2, \ldots, x_r$. Cramer's rule is applicable and
therefore the system has a unique solution $c_1, c_2, \ldots, c_r$; it is
obvious that the set of numbers $c_1, c_2, \ldots, c_r, c_{r+1}, \ldots, c_n$ will
serve as a solution of system (2). Since the values $c_{r+1}, \ldots, c_n$
for the unknowns $x_{r+1}, \ldots, x_n$, called free unknowns, can be
chosen in arbitrary fashion, we obtain an infinity of distinct solutions
of system (2).

On the other hand, any solution of (2) may be obtained in the
indicated way: if some solution $c_1, c_2, \ldots, c_n$ of (2) is given, then
we take the numbers $c_{r+1}, \ldots, c_n$ for the values of the free unknowns.
Then the numbers $c_1, c_2, \ldots, c_r$ will satisfy system (3) and
therefore will constitute the only solution of the system, which solution
is computed by Cramer's rule.

The foregoing may be combined into a rule for the solution of 
an arbitrary system of linear equations. 

Let there be a consistent system of linear equations (1) and let the
matrix A of the coefficients have rank r. In A we choose r linearly
independent rows and leave in (1) only those equations whose coefficients
lie in the chosen rows. In these equations we leave in the left members
r unknowns such that the determinant of their coefficients is nonzero;
the remaining unknowns are called free and are transposed to the right
sides of the equations. Assigning arbitrary numerical values to the free
unknowns and computing the values of the remaining unknowns by
Cramer's rule, we obtain all the solutions of system (1).

We also state the following result that we have obtained. 

A consistent system (1) has a unique solution if and only if the rank 
of matrix A is equal to the number of unknowns. 
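
The rule can be sketched in Python. Note the substitution: instead of Cramer's rule, the sketch below uses exact Gaussian elimination to express the remaining unknowns through the free ones, which yields the same solutions. The name `general_solution`, its return convention and the sample system are assumptions made for the illustration.

```python
from fractions import Fraction

def general_solution(A, b):
    """Return (basic, free, solve) for a consistent system, or None otherwise.

    basic and free are lists of column indices; solve(free_values) returns
    the full solution obtained for the given values of the free unknowns.
    """
    s, n = len(A), len(A[0])
    m = [[Fraction(x) for x in row] + [Fraction(c)] for row, c in zip(A, b)]
    basic, r = [], 0
    for col in range(n):
        p = next((t for t in range(r, s) if m[t][col] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][col]
        m[r] = [x / lead for x in m[r]]
        for t in range(s):
            if t != r and m[t][col] != 0:
                f = m[t][col]
                m[t] = [x - f * y for x, y in zip(m[t], m[r])]
        basic.append(col)
        r += 1
        if r == s:
            break
    # A row reading 0 = nonzero means the ranks of A and the augmented matrix differ.
    if any(all(x == 0 for x in row[:n]) and row[n] != 0 for row in m):
        return None
    free = [c for c in range(n) if c not in basic]

    def solve(free_values):
        x = [Fraction(0)] * n
        for c, v in zip(free, free_values):
            x[c] = Fraction(v)
        for t, col in enumerate(basic):
            x[col] = m[t][n] - sum(m[t][c] * x[c] for c in free)
        return x

    return basic, free, solve

A = [[1, 1, -2, -1, 1],
     [3, -1, 1, 4, 3],
     [1, 5, -9, -8, 1]]
basic, free, solve = general_solution(A, [1, 4, 0])
first = solve([3, 0, 0])
second = solve([2, 1, -2])
```

For this sample system of rank two the first two unknowns are expressed through the last three, which play the role of free unknowns; each choice of their values gives one solution.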

Example 1. Solve the system

$$
\begin{aligned}
5x_1 - x_2 + 2x_3 + x_4 &= 7, \\
2x_1 + x_2 + 4x_3 - 2x_4 &= 1, \\
x_1 - 3x_2 - 6x_3 + 5x_4 &= 0.
\end{aligned}
$$

The rank of the coefficient matrix is two: the second-order minor in the upper
left corner of this matrix is nonzero, but both third-order minors bordering it
are zero. The rank of the augmented matrix is three, since

$$
\begin{vmatrix}
5 & -1 & 7 \\
2 & 1 & 1 \\
1 & -3 & 0
\end{vmatrix} = -35 \neq 0.
$$

The system is thus inconsistent.
Example 2. Solve the system

$$
\begin{aligned}
7x_1 + 3x_2 &= 2, \\
x_1 - 2x_2 &= -3, \\
4x_1 + 9x_2 &= 11.
\end{aligned}
$$

The rank of the coefficient matrix is two, i.e., it is equal to the number
of unknowns; the rank of the augmented matrix is also two. Thus, the system
is consistent and has a unique solution. The left-hand sides of the first two
equations are linearly independent; solving the system of these two equations,
we get the values

$$x_1 = -\frac{5}{17}, \quad x_2 = \frac{23}{17}$$

for the unknowns. It is easy to see that this solution also satisfies the third
equation.

Example 3. Solve the system

$$
\begin{aligned}
x_1 + x_2 - 2x_3 - x_4 + x_5 &= 1, \\
3x_1 - x_2 + x_3 + 4x_4 + 3x_5 &= 4, \\
x_1 + 5x_2 - 9x_3 - 8x_4 + x_5 &= 0.
\end{aligned}
$$

The system is consistent since the rank of the augmented matrix (like
the rank of the matrix of coefficients) is two. The left members of the first and
third equations are linearly independent since the coefficients of the unknowns
$x_1$ and $x_2$ constitute a nonzero minor of order two. Solve the system of these
two equations, the unknowns $x_3$, $x_4$, $x_5$ being considered free; transpose them
to the right members of the equations and assume that they have been given
certain numerical values. Using Cramer's rule, we get

$$
\begin{aligned}
x_1 &= \frac{5}{4} + \frac{1}{4}x_3 - \frac{3}{4}x_4 - x_5, \\
x_2 &= -\frac{1}{4} + \frac{7}{4}x_3 + \frac{7}{4}x_4.
\end{aligned}
$$

These equations determine the general solution of the given system:
assigning arbitrary numerical values to the free unknowns, we obtain all the
solutions of our system. Thus, for example, the vectors (2, 5, 3, 0, 0),
(3, 5, 2, 1, -2), $(0, -\frac{1}{4}, 0, 0, \frac{5}{4})$ are solutions of our system. On the
other hand, substituting the expressions for $x_1$ and $x_2$ from the general solution
into any one of the equations of the system, say the second, which was earlier
rejected, we obtain an identity.

Example 4. Solve the system 

\[
\begin{aligned}
4x_1 + x_2 - 2x_3 + x_4 &= 3,\\
x_1 - 2x_2 - x_3 + 2x_4 &= 2,\\
2x_1 + 5x_2 - x_4 &= -1,\\
3x_1 + 3x_2 - x_3 - 3x_4 &= 1.
\end{aligned}
\]

Although the number of equations is equal to the number of unknowns, 
the determinant of the system is zero and, therefore, Cramer's rule is not 
applicable. The rank of the coefficient matrix is equal to three: in the upper right 
corner of this matrix is a nonzero third-order minor. The rank of the augmented 
matrix is also three, so the system is consistent. Considering only the first 
three equations and taking the unknown $x_1$ as free, we obtain the general 
solution in the form 

\[
x_2 = -\tfrac{1}{5} - \tfrac{2}{5}x_1, \qquad
x_3 = -\tfrac{8}{5} + \tfrac{9}{5}x_1, \qquad
x_4 = 0.
\]




Example 5. Suppose we have a system consisting of n + 1 equations in n 
unknowns. The augmented matrix $\bar{A}$ of this system is a square matrix of order 
n + 1. If our system is consistent, then, by the Kronecker-Capelli theorem, 
the determinant of $\bar{A}$ must be zero. 

Thus, let there be a system 

\[
\begin{aligned}
x_1 - 8x_2 &= 3,\\
2x_1 + x_2 &= 1,\\
4x_1 + 7x_2 &= -4.
\end{aligned}
\]

The determinant of the coefficients and the constant terms of these equations 
is different from zero: 

\[
\begin{vmatrix} 1 & -8 & 3 \\ 2 & 1 & 1 \\ 4 & 7 & -4 \end{vmatrix} = -77.
\]

The system is therefore inconsistent. 

The converse, generally speaking, is not true: from the determinant of 
matrix $\bar{A}$ being zero it does not follow that the ranks of matrices A and $\bar{A}$ 
coincide. 
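
Both halves of Example 5 can be illustrated numerically. In the sketch below the first matrix is the augmented matrix of the system above; the second system (x1 + x2 = 1, x1 + x2 = 2, 2x1 + 2x2 = 3) is a hypothetical one of our own, chosen so that the determinant of the augmented matrix vanishes although the system is plainly inconsistent.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion about the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Augmented matrix of the system in Example 5.
example5 = [[1, -8, 3], [2, 1, 1], [4, 7, -4]]
d_example5 = det3(example5)

# Hypothetical system: x1 + x2 = 1, x1 + x2 = 2, 2*x1 + 2*x2 = 3.
# Its first two equations contradict each other, yet the determinant is zero
# because the first two columns of the augmented matrix coincide.
counterexample = [[1, 1, 1], [1, 1, 2], [2, 2, 3]]
d_counterexample = det3(counterexample)

print(d_example5, d_counterexample)
```

A nonzero determinant therefore rules out consistency, while a zero determinant decides nothing by itself.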

12. Systems of Homogeneous Linear Equations 

Let us apply the findings of the preceding section to the case of 
a system of homogeneous linear equations: 

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= 0,\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= 0,\\
\dots\dots\dots\dots\dots\dots\dots\dots\dots\dots &\\
a_{s1}x_1 + a_{s2}x_2 + \dots + a_{sn}x_n &= 0.
\end{aligned}
\tag{1}
\]


From the Kronecker-Capelli theorem it follows that this system 
is always consistent, since adding a column of zeros cannot raise 
the rank of the matrix. This, incidentally, is evident by simple 
inspection: system (1) definitely has the trivial solution (0, 0, ..., 0). 

Let the coefficient matrix A of system (1) have rank r. If r = n, 
then the trivial solution will be the only solution of (1); for r < n, 
the system also has nontrivial solutions; to find all these solutions, use 
the same technique as above in the case of an arbitrary system of 
equations. In particular, a system of n homogeneous linear equations 
in n unknowns has nontrivial solutions if and only if the determinant 
of the system is zero.* Indeed, the fact that the determinant is zero 
is equivalent to the assertion that the rank of matrix A is less than n. 
On the other hand, if in a system of homogeneous equations the number 
of equations is less than the number of unknowns, then the system must 
definitely have solutions different from zero, since in that case the rank 
cannot be equal to the number of unknowns. This was already 
obtained in Sec. 1 by other reasoning. 

Let us, for example, examine the case of a system consisting of 
n - 1 homogeneous equations in n unknowns; assume that the left 
members of these equations are linearly independent among themselves. 
Let 

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\dots & \dots & \dots & \dots \\
a_{n-1,1} & a_{n-1,2} & \dots & a_{n-1,n}
\end{pmatrix}
\]

be the matrix of the coefficients of this system. Denote by $M_i$ the 
minor of order n - 1 obtained by deleting the ith column from A, 
i = 1, 2, ..., n. Then for one of the solutions of our system we have 
the set of numbers 

\[
M_1,\ -M_2,\ M_3,\ -M_4,\ \dots,\ (-1)^{n-1}M_n \tag{2}
\]

and any other solution is proportional to it. 


* One half of this assertion was already proved in Sec. 7. 

Proof. Since, by hypothesis, the rank of matrix A is n - 1, 
one of the minors $M_i$ must be nonzero; let it be $M_n$. Assume the 
unknown $x_n$ to be free and transpose it to the right side of each of the 
equations. We then get 

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1,n-1}x_{n-1} &= -a_{1n}x_n,\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2,n-1}x_{n-1} &= -a_{2n}x_n,\\
\dots\dots\dots\dots\dots\dots\dots\dots\dots\dots &\\
a_{n-1,1}x_1 + a_{n-1,2}x_2 + \dots + a_{n-1,n-1}x_{n-1} &= -a_{n-1,n}x_n.
\end{aligned}
\]

Applying Cramer's rule, we obtain the general solution of the given 
system of equations, which, after simple manipulations, becomes 

\[
x_i = (-1)^{n-i}\,\frac{M_i}{M_n}\,x_n, \qquad i = 1, 2, \dots, n-1. \tag{3}
\]

Setting $x_n = (-1)^{n-1}M_n$, we obtain $x_i = (-1)^{2n-i-1}M_i$, 
i = 1, 2, ..., n - 1, or, since the difference (2n - i - 1) - (i - 1) = 
2n - 2i is an even number, $x_i = (-1)^{i-1}M_i$; that is, the set 
of numbers (2) will indeed be a solution of our system of equations. 
Any other solution of this system is obtained from formulas (3) 
for a different numerical value of the unknown $x_n$ and so is 
proportional to solution (2). This assertion clearly holds true for the case 
when $M_n = 0$, but one of the minors $M_i$, $1 \le i \le n - 1$, is nonzero. 
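
The solution (2) is easy to compute for small n. The sketch below is only an illustration under our own assumptions: the 2 x 3 coefficient matrix is hypothetical, and the helper names `det` and `minors_solution` are not from the text.

```python
def det(m):
    """Determinant by cofactor expansion about the first row (small orders only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def minors_solution(A):
    """Return (M_1, -M_2, ..., (-1)^(n-1) M_n) for an (n-1) x n matrix A.

    M_i is the minor obtained by deleting the ith column; with the 0-based
    index i used here the sign (-1)**i matches (-1)^(i-1) in the text.
    """
    n = len(A[0])
    return [(-1) ** i * det([row[:i] + row[i + 1:] for row in A]) for i in range(n)]

A = [[1, 2, 3], [4, 5, 6]]  # hypothetical system: two equations, three unknowns
solution = minors_solution(A)
residuals = [sum(a * x for a, x in zip(row, solution)) for row in A]
print(solution, residuals)
```

For n = 3 the set of numbers (2) is simply the vector product of the two rows of A.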

Solutions of a system of homogeneous linear equations have 
the following properties. If the vector $\beta = (b_1, b_2, \dots, b_n)$ is 
a solution of system (1), then for any scalar k the vector 
$k\beta = (kb_1, kb_2, \dots, kb_n)$ is also a solution of the system. This is 
verified directly by substitution into any one of the equations (1). If the 
vector $\gamma = (c_1, c_2, \dots, c_n)$ is another solution of (1), then the vector 
$\beta + \gamma = (b_1 + c_1, b_2 + c_2, \dots, b_n + c_n)$ is also a solution of the 
system: 

\[
\sum_{j=1}^{n} a_{ij}(b_j + c_j) = \sum_{j=1}^{n} a_{ij}b_j + \sum_{j=1}^{n} a_{ij}c_j = 0,
\qquad i = 1, 2, \dots, s.
\]

Thus, generally, any linear combination of solutions of the homogeneous 
system (1) is a solution of the system. Note that in the case of a 
nonhomogeneous system, that is, a system of linear equations whose 
constant terms are not all equal to zero, no such assertion is true: 
neither the sum of two solutions of a system of nonhomogeneous 
equations nor the product of a solution of the system by a scalar can 
serve as solutions of the system. 

From Sec. 9 we know that any system of n-dimensional vectors 
consisting of more than n vectors will be linearly dependent. Whence 
it follows that from among the solutions of the homogeneous system 
(1), which, as we know, are n-dimensional vectors, it is 
possible to choose a finite maximal linearly independent system, 
that is, maximal in the sense that any other solution of system (1) 
will be a linear combination of the solutions that enter into the chosen 
system. Any maximal linearly independent system of solutions of 
the homogeneous system of equations (1) is called its fundamental 
system of solutions. 

Let us once again stress the fact that an n-dimensional vector is 
a solution of system (1) if and only if it is a linear combination of vectors 
comprising the given fundamental system. 

Quite naturally, the fundamental system exists only if system (1) 
has nontrivial solutions, that is, if the rank of its matrix of coeffi- 
cients is less than the number of unknowns. Then system (1) can 
have many different fundamental systems of solutions. All these 
systems are equivalent, however, since each vector of any one of the 
systems is linearly expressible in terms of any other system, and 
for this reason the systems consist of one and the same number of 
solutions. 

The following theorem is valid. 

If the rank r of the coefficient matrix of the system of homogeneous 
linear equations (1) is less than the number of unknowns n, then any 
fundamental system of solutions of (1) consists of n - r solutions. 

To prove this, note that n - r is the number of free unknowns 
in system (1); let the unknowns $x_{r+1}, x_{r+2}, \dots, x_n$ be free. We 
consider an arbitrary nonzero determinant d of order n - r, which 
we write as follows: 

\[
d = \begin{vmatrix}
c_{1,r+1} & c_{1,r+2} & \dots & c_{1n} \\
c_{2,r+1} & c_{2,r+2} & \dots & c_{2n} \\
\dots & \dots & \dots & \dots \\
c_{n-r,r+1} & c_{n-r,r+2} & \dots & c_{n-r,n}
\end{vmatrix}
\]

Taking elements of the ith row of this determinant, $1 \le i \le n - r$, 
for the values of the free unknowns, we get unique values for the 
unknowns $x_1, x_2, \dots, x_r$. In other words, we arrive at a quite 
definite solution of the system (1) of equations. Let us write the solution 
in the form of a vector: 

\[
\alpha_i = (c_{i1}, c_{i2}, \dots, c_{ir}, c_{i,r+1}, c_{i,r+2}, \dots, c_{in}).
\]

The set of vectors $\alpha_1, \alpha_2, \dots, \alpha_{n-r}$ that we have obtained 
serves as a fundamental system of solutions for the system (1) of 
equations. Indeed, this set of vectors is linearly independent since 
the matrix made up of them (as rows) contains a nonzero minor d 
of order n - r. On the other hand, let 

\[
\beta = (b_1, b_2, \dots, b_r, b_{r+1}, b_{r+2}, \dots, b_n)
\]

be an arbitrary solution of system (1). We will prove that the vector $\beta$ 
can be expressed linearly in terms of the vectors $\alpha_1, \alpha_2, \dots, \alpha_{n-r}$. 

Denote by $\alpha'_i$, i = 1, 2, ..., n - r, the ith row of the 
determinant d; regard this row as an (n - r)-dimensional vector. Then set 

\[
\beta' = (b_{r+1}, b_{r+2}, \dots, b_n).
\]

The vectors $\alpha'_i$, i = 1, 2, ..., n - r, are linearly independent 
since $d \neq 0$. However, the system of (n - r)-dimensional vectors 

\[
\alpha'_1, \alpha'_2, \dots, \alpha'_{n-r}, \beta'
\]

is linearly dependent since the number of vectors in it is greater than 
their dimensionality. Hence there are scalars $k_1, k_2, \dots, k_{n-r}$ such 
that 

\[
\beta' = k_1\alpha'_1 + k_2\alpha'_2 + \dots + k_{n-r}\alpha'_{n-r}. \tag{4}
\]

Now consider the n-dimensional vector 

\[
\delta = k_1\alpha_1 + k_2\alpha_2 + \dots + k_{n-r}\alpha_{n-r} - \beta.
\]

Since the vector $\delta$ is a linear combination of the solutions of the 
system (1) of homogeneous equations, it will be a solution of the 
system. From (4) it follows that in the solution $\delta$ the values of 
all the free unknowns are zero. However, the unique solution of 
system (1) which is obtained for zero values of the free unknowns 
will be the trivial solution. Thus, $\delta = 0$, that is, 

\[
\beta = k_1\alpha_1 + k_2\alpha_2 + \dots + k_{n-r}\alpha_{n-r},
\]

which proves the theorem. 

Note that the foregoing proof permits us to assert that we will 
obtain all the fundamental systems of solutions of the system (1) 
of homogeneous equations by taking for d all possible nonzero deter- 
minants of order n — r. 


Example. Given the following system of homogeneous linear equations: 

\[
\begin{aligned}
3x_1 + x_2 - 8x_3 + 2x_4 + x_5 &= 0,\\
2x_1 - 2x_2 - 3x_3 - 7x_4 + 2x_5 &= 0,\\
x_1 + 11x_2 - 12x_3 + 34x_4 - 5x_5 &= 0,\\
x_1 - 5x_2 + 2x_3 - 16x_4 + 3x_5 &= 0.
\end{aligned}
\]

The rank of the coefficient matrix is two, the number of unknowns is equal 
to five; therefore every fundamental system of solutions of this system of 
equations consists of three solutions. We solve the system confining ourselves to the 
first two linearly independent equations and considering $x_3$, $x_4$, $x_5$ as free 
unknowns. We obtain the general solution in the form 

\[
\begin{aligned}
x_1 &= \tfrac{19}{8}x_3 + \tfrac{3}{8}x_4 - \tfrac{1}{2}x_5,\\
x_2 &= \tfrac{7}{8}x_3 - \tfrac{25}{8}x_4 + \tfrac{1}{2}x_5.
\end{aligned}
\]

Then we take the following three linearly independent three-dimensional vectors 
(1, 0, 0), (0, 1, 0), (0, 0, 1). Substituting the components of each of them 
into the general solution as values for the free unknowns and computing the 
values for $x_1$ and $x_2$, we get the following fundamental system of solutions 
of the given system of equations: 

\[
\alpha_1 = \left(\tfrac{19}{8}, \tfrac{7}{8}, 1, 0, 0\right), \quad
\alpha_2 = \left(\tfrac{3}{8}, -\tfrac{25}{8}, 0, 1, 0\right), \quad
\alpha_3 = \left(-\tfrac{1}{2}, \tfrac{1}{2}, 0, 0, 1\right).
\]
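
The construction used in this example can be replayed in exact arithmetic. The sketch below assumes the coefficient matrix and the general solution of the example; the function names are ours. It builds the three solutions from the rows of the unit determinant and checks that they, and one arbitrary linear combination of them, satisfy all four equations.

```python
from fractions import Fraction as F

# Coefficient matrix of the example (four equations, five unknowns).
A = [
    [3, 1, -8, 2, 1],
    [2, -2, -3, -7, 2],
    [1, 11, -12, 34, -5],
    [1, -5, 2, -16, 3],
]

def general_solution(x3, x4, x5):
    """Solution of the system determined by the free unknowns x3, x4, x5."""
    x1 = F(19, 8) * x3 + F(3, 8) * x4 - F(1, 2) * x5
    x2 = F(7, 8) * x3 - F(25, 8) * x4 + F(1, 2) * x5
    return [x1, x2, F(x3), F(x4), F(x5)]

def satisfies(vector):
    """True if the vector makes every left-hand side of the system vanish."""
    return all(sum(a * x for a, x in zip(row, vector)) == 0 for row in A)

# Rows of the unit determinant of order three serve as values of the free unknowns.
fundamental = [general_solution(*row) for row in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]]
all_ok = all(satisfies(v) for v in fundamental)

# An arbitrary linear combination of the three solutions is again a solution.
combination = [2 * u - 3 * v + 5 * w for u, v, w in zip(*fundamental)]
combination_ok = satisfies(combination)
print(fundamental, all_ok, combination_ok)
```

Any other nonzero determinant of order three could be used in place of the unit one; it would give a different fundamental system.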







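We conclude this section by considering the relationship between 
the solutions of nonhomogeneous and homogeneous systems. Suppose 
we have a system of nonhomogeneous linear equations 

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2,\\
\dots\dots\dots\dots\dots\dots\dots\dots\dots\dots &\\
a_{s1}x_1 + a_{s2}x_2 + \dots + a_{sn}x_n &= b_s.
\end{aligned}
\tag{5}
\]

The system of homogeneous linear equations 

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= 0,\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= 0,\\
\dots\dots\dots\dots\dots\dots\dots\dots\dots\dots &\\
a_{s1}x_1 + a_{s2}x_2 + \dots + a_{sn}x_n &= 0
\end{aligned}
\tag{6}
\]

obtained from (5) by replacing the constant terms by zeros is called 
the reduced system of (5). There is a close connection between the 
solutions of (5) and (6), as the following two theorems indicate. 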

I. The sum of any solution of system (5) and any solution of the 
reduced system (6) is again a solution of system (5). 

Indeed, let $c_1, c_2, \dots, c_n$ be a solution of (5), and $d_1, d_2, \dots, d_n$ 
a solution of (6). Take any one of the equations of system (5), say 
the kth, and substitute into it the numbers $c_1 + d_1, c_2 + d_2, \dots, 
c_n + d_n$ in place of the unknowns. We get 

\[
\sum_{j=1}^{n} a_{kj}(c_j + d_j) = \sum_{j=1}^{n} a_{kj}c_j + \sum_{j=1}^{n} a_{kj}d_j = b_k + 0 = b_k.
\]


II. The difference between any two solutions of (5) is a solution of (6). 

Indeed, let $c_1, c_2, \dots, c_n$ and $c'_1, c'_2, \dots, c'_n$ be solutions of 
system (5). Take any one of the equations of (6), say the kth, and 
substitute into it in place of the unknowns the numbers 
$c_1 - c'_1, c_2 - c'_2, \dots, c_n - c'_n$. We get 

\[
\sum_{j=1}^{n} a_{kj}(c_j - c'_j) = \sum_{j=1}^{n} a_{kj}c_j - \sum_{j=1}^{n} a_{kj}c'_j = b_k - b_k = 0.
\]


It follows from these theorems that by finding one solution of the 
system (5) of nonhomogeneous linear equations and adding it to every 
solution of the reduced system (6), we obtain all solutions of (5). 
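
Theorems I and II can be tried out on a small case. The system in this sketch is hypothetical (two equations in three unknowns, chosen by us), as are the particular solution c and the solution h of the reduced system.

```python
A = [[1, 1, 1], [1, -1, 2]]   # hypothetical coefficients
b = [6, 5]                    # hypothetical constant terms

def apply(matrix, vector):
    """Left-hand sides of the equations for the given values of the unknowns."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

c = [1, 2, 3]     # a solution of the nonhomogeneous system
h = [3, -1, -2]   # a solution of the reduced (homogeneous) system

# Theorem I: a solution of (5) plus a solution of (6) again solves (5).
shifted = [ci + 2 * hi for ci, hi in zip(c, h)]
lhs_shifted = apply(A, shifted)

# Theorem II: the difference of two solutions of (5) solves (6).
difference = [si - ci for si, ci in zip(shifted, c)]
lhs_difference = apply(A, difference)

print(lhs_shifted, lhs_difference)
```

Here the reduced system has rank two in three unknowns, so every solution of the full system is c plus a multiple of h.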


CHAPTER 3 


THE ALGEBRA 
OF MATRICES 


13. Matrix Multiplication 

In the preceding chapters the concept of a matrix was utilized 
as an essential auxiliary tool in the study of systems of linear equa- 
tions. Numerous other applications have made it the subject of a 
large independent theory, many branches of which go beyond the 
limits of this course. We shall now discuss the fundamentals of this 
theory which starts with the fact that two algebraic operations, 
addition and multiplication, are defined in the set of all square matri- 
ces of a given order in a very peculiar but fully motivated fashion. 
We begin with the multiplication of matrices; addition will be 
introduced in Sec. 15. 

From the course of analytic geometry we know that when the 
axes of a rectangular coordinate system in the plane are rotated 
through an angle $\alpha$, the coordinates of a point are transformed 
according to the following formulas: 

\[
\begin{aligned}
x &= x'\cos\alpha - y'\sin\alpha,\\
y &= x'\sin\alpha + y'\cos\alpha
\end{aligned}
\]

where x and y are the old coordinates of the point, and x', y' are the 
new coordinates. Thus, x and y are expressed linearly in terms of 
x' and y' with certain numerical coefficients. There are also many 
other instances of the substitution of unknowns (or variables) in 
which the old unknowns are linearly expressed in terms of the new 
ones. Such a substitution of unknowns is ordinarily called a linear 
transformation (or linear substitution). We thus arrive at the follow- 
ing definition. 

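A linear transformation of unknowns is a transition from a set 
of n unknowns $x_1, x_2, \dots, x_n$ to a set of n unknowns $y_1, y_2, \dots, y_n$ 
such that the old unknowns are expressed linearly in terms of the 
new ones with certain numerical coefficients: 

\[
\begin{aligned}
x_1 &= a_{11}y_1 + a_{12}y_2 + \dots + a_{1n}y_n,\\
x_2 &= a_{21}y_1 + a_{22}y_2 + \dots + a_{2n}y_n,\\
\dots\dots &\dots\dots\dots\dots\dots\dots\dots\dots\\
x_n &= a_{n1}y_1 + a_{n2}y_2 + \dots + a_{nn}y_n.
\end{aligned}
\tag{1}
\]

The linear transformation (1) is fully determined by its 
coefficient matrix 

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\dots & \dots & \dots & \dots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{pmatrix}
\]
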
since two linear transformations of the same matrix can differ only 
in the letters denoting the unknowns; we take it, however, that the 
choice of these designations is wholly in our own hands. Conversely, 
specifying an arbitrary matrix of order n, we can immediately write 
the linear transformation for which this matrix serves as the coef- 
ficient matrix. Thus, there is a one-to-one correspondence between 
the linear transformations of n unknowns and the square matrices 
of order n. Therefore, every concept involving linear transformations 
and every property of these transformations must be associated with 
a similar concept or property involving matrices. 

Let us examine the problem of a successive performance of two 
linear transformations. Suppose that following the linear 
transformation (1) there is effected a linear transformation 

\[
\begin{aligned}
y_1 &= b_{11}z_1 + b_{12}z_2 + \dots + b_{1n}z_n,\\
y_2 &= b_{21}z_1 + b_{22}z_2 + \dots + b_{2n}z_n,\\
\dots\dots &\dots\dots\dots\dots\dots\dots\dots\dots\\
y_n &= b_{n1}z_1 + b_{n2}z_2 + \dots + b_{nn}z_n
\end{aligned}
\tag{2}
\]

which takes the set of unknowns $y_1, y_2, \dots, y_n$ into the set 
$z_1, z_2, \dots, z_n$; denote the matrix of this transformation by B. 
Substituting into (1) the expressions for $y_1, y_2, \dots, y_n$ from (2), 
we get linear expressions for the unknowns $x_1, x_2, \dots, x_n$ in terms 
of the unknowns $z_1, z_2, \dots, z_n$. Thus, the result of a successive 
execution of two linear transformations of unknowns will again be 
a linear transformation. 


Example. The result of the successive performance of linear transformations 

\[
\begin{aligned}
x_1 &= 3y_1 - y_2, & y_1 &= z_1 + z_2,\\
x_2 &= y_1 + 5y_2, & y_2 &= 4z_1 + 2z_2
\end{aligned}
\]

is the linear transformation 

\[
\begin{aligned}
x_1 &= 3(z_1 + z_2) - (4z_1 + 2z_2) = -z_1 + z_2,\\
x_2 &= (z_1 + z_2) + 5(4z_1 + 2z_2) = 21z_1 + 11z_2.
\end{aligned}
\]

Denote by C the matrix of the linear transformation which 
is the result of the successive performance of transformations (1) 
and (2) and find the law by which its elements $c_{ik}$, i, k = 1, 2, ..., n, 
are expressed in terms of the elements of the matrices A and B. 
Writing down the transformations (1) and (2) succinctly in the form 

\[
x_i = \sum_{j=1}^{n} a_{ij}y_j, \quad i = 1, 2, \dots, n; \qquad
y_j = \sum_{k=1}^{n} b_{jk}z_k, \quad j = 1, 2, \dots, n,
\]

we obtain 

\[
x_i = \sum_{j=1}^{n} a_{ij}\Bigl(\sum_{k=1}^{n} b_{jk}z_k\Bigr)
    = \sum_{k=1}^{n}\Bigl(\sum_{j=1}^{n} a_{ij}b_{jk}\Bigr)z_k,
\qquad i = 1, 2, \dots, n.
\]

Thus, the coefficient of $z_k$ in the expression for $x_i$ (that is, the element 
$c_{ik}$ of matrix C) is of the form 

\[
c_{ik} = \sum_{j=1}^{n} a_{ij}b_{jk} = a_{i1}b_{1k} + a_{i2}b_{2k} + \dots + a_{in}b_{nk}. \tag{3}
\]

The element of matrix C in the ith row and kth column is equal to 
the sum of the products of the corresponding elements of the ith row 
of matrix A and the kth column of matrix B. 

Formula (3), which expresses the elements of matrix C in terms 
of the elements of matrices A and B, permits us to write down C 
immediately, given A and B, without having to examine the linear 
transformations corresponding to the matrices A and B. In this 
fashion, to any pair of square matrices of order n there corresponds 
a uniquely determined third matrix. We can 
say that in the set of all square matrices of nth order we have 
defined an algebraic operation which is called matrix multiplication, 
and matrix C is called the product of the matrix A by the 
matrix B: 

C = AB 
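
Formula (3) translates directly into a short program. The sketch below is only an illustration: the names `matmul` and `apply` and the test values of z are our own. It multiplies the matrices of the two transformations from the example of this section and compares the result with step-by-step substitution.

```python
def matmul(A, B):
    """Return C = AB, where c_ik is the sum over j of a_ij * b_jk (formula (3))."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

def apply(M, v):
    """Values of the old unknowns, given the matrix M and values v of the new ones."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[3, -1], [1, 5]]   # x1 = 3*y1 - y2,  x2 = y1 + 5*y2
B = [[1, 1], [4, 2]]    # y1 = z1 + z2,    y2 = 4*z1 + 2*z2
C = matmul(A, B)

z = [2, -3]                         # arbitrary test values for z1, z2
x_stepwise = apply(A, apply(B, z))  # first (2), then (1)
x_direct = apply(C, z)              # the single transformation with matrix AB
print(C, x_stepwise, x_direct)
```

The rows of C are the coefficients -1, 1 and 21, 11 found above by direct substitution.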

Let us once again formulate the relationship between linear 
transformations and matrix multiplication. 

A linear transformation of unknowns obtained as a result of the 
successive performance of two linear transformations of matrices A and 
B has as its coefficient matrix the matrix AB. 

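Examples. 

(1)
\[
\begin{pmatrix} 4 & 9 \\ -1 & 3 \end{pmatrix}
\begin{pmatrix} 1 & -3 \\ -2 & 1 \end{pmatrix}
= \begin{pmatrix} 4\cdot 1 + 9\cdot(-2) & 4\cdot(-3) + 9\cdot 1 \\ (-1)\cdot 1 + 3\cdot(-2) & (-1)\cdot(-3) + 3\cdot 1 \end{pmatrix}
= \begin{pmatrix} -14 & -3 \\ -7 & 6 \end{pmatrix}
\]

(2)
\[
\begin{pmatrix} 2 & 0 & 1 \\ -2 & 3 & 2 \\ 4 & -1 & 5 \end{pmatrix}
\begin{pmatrix} -3 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & -1 & 3 \end{pmatrix}
= \begin{pmatrix} -6 & 1 & 3 \\ 6 & 2 & 9 \\ -12 & -3 & 14 \end{pmatrix}
\]

(3)
\[
\begin{pmatrix} 7 & 2 \\ 1 & 1 \end{pmatrix}^2
= \begin{pmatrix} 7 & 2 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} 7 & 2 \\ 1 & 1 \end{pmatrix}
= \begin{pmatrix} 51 & 16 \\ 8 & 3 \end{pmatrix}
\]

(4) Find the result of the successive performance of the linear 
transformations 

\[
\begin{aligned}
x_1 &= 5y_1 - y_2 + 3y_3,\\
x_2 &= y_1 - 2y_2,\\
x_3 &= 7y_2 - y_3
\end{aligned}
\]

and 

\[
\begin{aligned}
y_1 &= 2z_1 + z_3,\\
y_2 &= z_2 - 5z_3,\\
y_3 &= 2z_2.
\end{aligned}
\]

Multiplying the matrices, we obtain 

\[
\begin{pmatrix} 5 & -1 & 3 \\ 1 & -2 & 0 \\ 0 & 7 & -1 \end{pmatrix}
\begin{pmatrix} 2 & 0 & 1 \\ 0 & 1 & -5 \\ 0 & 2 & 0 \end{pmatrix}
= \begin{pmatrix} 10 & 5 & 10 \\ 2 & -2 & 11 \\ 0 & 5 & -35 \end{pmatrix}
\]

The desired linear transformation is therefore of the form 

\[
\begin{aligned}
x_1 &= 10z_1 + 5z_2 + 10z_3,\\
x_2 &= 2z_1 - 2z_2 + 11z_3,\\
x_3 &= 5z_2 - 35z_3.
\end{aligned}
\]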

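Take one of the above examples of matrix multiplication, 
say (2), and find the product of the same matrices, but in reverse 
order: 

\[
\begin{pmatrix} -3 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & -1 & 3 \end{pmatrix}
\begin{pmatrix} 2 & 0 & 1 \\ -2 & 3 & 2 \\ 4 & -1 & 5 \end{pmatrix}
= \begin{pmatrix} -8 & 3 & -1 \\ 0 & 5 & 9 \\ 14 & -6 & 13 \end{pmatrix}
\]
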

We see that the product of the matrices depends on the order 
of the factors; in other words, matrix multiplication is noncommutative. 
Actually, this is something we should have expected, if only 
because the matrices A and B are not of equal status in the 
definition of matrix C given above by means of formula (3): in A 
we take the rows and in B the columns. 
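
The dependence on the order of the factors is easy to confirm by machine. The sketch below recomputes both products for the matrices of Example (2); the helper `matmul` is our own direct transcription of formula (3).

```python
def matmul(A, B):
    """Product AB: rows of A are combined with columns of B (formula (3))."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

A = [[2, 0, 1], [-2, 3, 2], [4, -1, 5]]
B = [[-3, 1, 0], [0, 2, 1], [0, -1, 3]]

AB = matmul(A, B)
BA = matmul(B, A)
commute = AB == BA
print(AB)
print(BA)
print(commute)
```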

Examples of noncommutative matrices of order n, that is, matrices 
whose product changes with an interchange of the factors, may 
be given for all n beyond n = 1 [the second-order matrices in Example 
(1) are noncommutative]. On the other hand, two given matrices 
may accidentally turn out to be commutative, as witness the 
following example: 



Matrix multiplication is associative; one can therefore speak of 
a uniquely defined product of any finite number of matrices of 
order n, taken in a definite order (because of the noncommutativity 
of multiplication). 

Proof. Suppose we have three arbitrary matrices of order n: A, 
B and C. In abbreviated notation (which indicates the general 



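aspect of their elements) we have $A = (a_{ij})$, $B = (b_{ij})$, $C = (c_{ij})$. 
We also introduce the following designations: 

\[
AB = U = (u_{ij}), \qquad BC = V = (v_{ij}),
\]
\[
(AB)C = S = (s_{ij}), \qquad A(BC) = T = (t_{ij}).
\]

We have to prove the truth of the equation (AB)C = A(BC), that 
is, S = T. However 

\[
u_{il} = \sum_{k=1}^{n} a_{ik}b_{kl}, \qquad v_{kj} = \sum_{l=1}^{n} b_{kl}c_{lj}
\]

and, therefore, because of the equations S = UC, T = AV, 

\[
s_{ij} = \sum_{l=1}^{n} u_{il}c_{lj} = \sum_{l=1}^{n}\sum_{k=1}^{n} a_{ik}b_{kl}c_{lj},
\]
\[
t_{ij} = \sum_{k=1}^{n} a_{ik}v_{kj} = \sum_{k=1}^{n}\sum_{l=1}^{n} a_{ik}b_{kl}c_{lj}.
\]

That is to say, $s_{ij} = t_{ij}$ for i, j = 1, 2, ..., n. 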

To go deeper into the properties of matrix multiplication we 
have to study the determinants of matrices. For the sake of brevity, we agree 
to denote the determinant of matrix A by |A|. If in each of the 
above examples the reader will take the pains to compute the 
determinants of the matrices being multiplied and to compare the product 
of these determinants with the determinant of the product of the 
given matrices, he will detect an extremely curious regularity 
which is expressed as the following very important theorem on the 
multiplication of determinants. 

The determinant of a product of several matrices of order n is equal 
to the product of the determinants of these matrices. 

It will suffice to prove this theorem for the case of two matrices. 
Let there be given the nth-order matrices $A = (a_{ij})$ and $B = (b_{ij})$ 
and let $AB = C = (c_{ij})$. Construct the following auxiliary 
determinant $\Delta$ of order 2n: in the upper left corner put matrix A, in the 
lower right corner, matrix B; the entire upper right corner will be 
occupied by zeros; finally, put the number -1 along the principal 
diagonal of the lower left corner and zeros elsewhere. Determinant 
$\Delta$ will then look like this: 

\[
\Delta = \begin{vmatrix}
a_{11} & a_{12} & \dots & a_{1n} & 0 & 0 & \dots & 0 \\
a_{21} & a_{22} & \dots & a_{2n} & 0 & 0 & \dots & 0 \\
\dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots \\
a_{n1} & a_{n2} & \dots & a_{nn} & 0 & 0 & \dots & 0 \\
-1 & 0 & \dots & 0 & b_{11} & b_{12} & \dots & b_{1n} \\
0 & -1 & \dots & 0 & b_{21} & b_{22} & \dots & b_{2n} \\
\dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots \\
0 & 0 & \dots & -1 & b_{n1} & b_{n2} & \dots & b_{nn}
\end{vmatrix}
\]




Applying the Laplace theorem to the determinant $\Delta$ (expansion 
about the first n rows), we get the following equation: 

\[
\Delta = |A| \cdot |B|. \tag{4}
\]

Now let us attempt to transform the determinant $\Delta$, without 
changing its value, so that all elements $b_{ij}$, i, j = 1, 2, ..., n, 
are replaced by zeros. To do this, add to the (n + 1)th column 
of $\Delta$ its first column multiplied by $b_{11}$, the second multiplied by $b_{21}$ 
and so on, and finally, the nth column, multiplied by $b_{n1}$. Then 
add to the (n + 2)th column of determinant $\Delta$ the first column 
multiplied by $b_{12}$, the second multiplied by $b_{22}$ and so on. 
Generally, we add to the (n + j)th column of the determinant $\Delta$, where 
j = 1, 2, ..., n, the sum of the first n columns taken, respectively, 
with the coefficients $b_{1j}, b_{2j}, \dots, b_{nj}$. 

It is easy to see that these manipulations do not change the 
determinant and actually result in the replacement of all 
elements $b_{ij}$ by zeros. At the same time, in place of the zeros in the 
upper right corner of the determinant there appear the following 
numbers: at the intersection of the ith row and the (n + j)th column 
of the determinant, i, j = 1, 2, ..., n, will stand the number 
$a_{i1}b_{1j} + a_{i2}b_{2j} + \dots + a_{in}b_{nj}$, equal [because of (3)] to the element 
$c_{ij}$ of matrix C = AB. The upper right corner of the determinant 
is now occupied by matrix C: 

\[
\Delta = \begin{vmatrix}
a_{11} & a_{12} & \dots & a_{1n} & c_{11} & c_{12} & \dots & c_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} & c_{21} & c_{22} & \dots & c_{2n} \\
\dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots \\
a_{n1} & a_{n2} & \dots & a_{nn} & c_{n1} & c_{n2} & \dots & c_{nn} \\
-1 & 0 & \dots & 0 & 0 & 0 & \dots & 0 \\
0 & -1 & \dots & 0 & 0 & 0 & \dots & 0 \\
\dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots \\
0 & 0 & \dots & -1 & 0 & 0 & \dots & 0
\end{vmatrix}
\]

Applying the Laplace theorem once again, we expand the determinant 
about the last n columns. The complementary minor of 
the minor |C| is equal to $(-1)^n$, and since the minor |C| is located 
in rows with position numbers 1, 2, ..., n and in columns with 
position numbers n + 1, n + 2, ..., 2n, and 

\[
1 + 2 + \dots + n + (n + 1) + (n + 2) + \dots + 2n = 2n^2 + n,
\]

it follows that 

\[
\Delta = (-1)^{2n^2 + n}(-1)^n\,|C|
\]

or, due to the evenness of the number $2n^2 + 2n$, 

\[
\Delta = |C|. \tag{5}
\]

Finally, from (4) and (5) follows the equation we set out to prove: 

\[
|C| = |A| \cdot |B|.
\]

The multiplication theorem for determinants could be proved 
without invoking the Laplace theorem. One such proof is given 
at the end of Sec. 16. 
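
The multiplication theorem, and the auxiliary determinant used in its proof, can be checked on the matrices of Example (2) of Sec. 13. The sketch is ours: `det` is a plain cofactor expansion, adequate only for small orders, and `delta` is the determinant of order 2n assembled exactly as described in the proof.

```python
def det(m):
    """Determinant by cofactor expansion about the first row (small orders only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def matmul(A, B):
    """Product AB by formula (3) of Sec. 13."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

A = [[2, 0, 1], [-2, 3, 2], [4, -1, 5]]
B = [[-3, 1, 0], [0, 2, 1], [0, -1, 3]]
n = len(A)

# Auxiliary determinant of order 2n: A in the upper left corner, zeros in the
# upper right, -1 along the diagonal of the lower left, B in the lower right.
delta = ([A[i] + [0] * n for i in range(n)] +
         [[-1 if j == i else 0 for j in range(n)] + B[i] for i in range(n)])

d_A, d_B = det(A), det(B)
d_C = det(matmul(A, B))
d_delta = det(delta)
print(d_A, d_B, d_C, d_delta)
```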

14. Inverse Matrices 

A square matrix is called singular if its determinant is zero; 
otherwise it is nonsingular. Accordingly, a linear transformation 
of unknowns is called singular or nonsingular depending on whether 
the coefficient determinant of this transformation is zero or not. 
The following assertion follows from the theorem proved at the 
end of Sec. 13. 

The product of matrices, at least one of which is singular, is a 
singular matrix. 

The product of any nonsingular matrices is a nonsingular matrix. 

From this there follows the assertion (because of the relationship 
existing between matrix multiplication and the successive perfor- 
mance of linear transformations): the result of a successive perfor- 
mance of several linear transformations is a nonsingular transforma- 
tion if and only if all the given transformations are nonsingular. 

The role of unity in matrix multiplication is played by the unit 
(identity) matrix 

\[
E = \begin{pmatrix}
1 & 0 & \dots & 0 \\
0 & 1 & \dots & 0 \\
\dots & \dots & \dots & \dots \\
0 & 0 & \dots & 1
\end{pmatrix}
\]

It commutes with any matrix A of a given order, 

\[
AE = EA = A. \tag{1}
\]

These equalities are proved either by direct application of the 
rule for multiplying matrices or on the basis of the remark that the 
unit (identity) matrix corresponds to an identical linear 
transformation of unknowns: 

\[
x_1 = y_1, \quad x_2 = y_2, \quad \dots, \quad x_n = y_n,
\]

the performance of which, either prior to or following any other 
linear transformation, obviously does not alter that transformation. 
Note that matrix E is the only matrix which satisfies condition (1) 
for any matrix A. Indeed, if there were also a matrix E' with this 
property, we would have 

\[
E'E = E', \qquad E'E = E,
\]

whence E' = E. 

The question of whether a given matrix A has an inverse turns 
out to be more complicated. Since matrix multiplication is not 
commutative, we will now speak of the right inverse matrix, that 
is, a matrix $A^{-1}$ such that postmultiplication of A by this matrix 
yields the identity matrix: 

\[
AA^{-1} = E. \tag{2}
\]

Suppose matrix A is singular; then if matrix $A^{-1}$ existed, the product 
on the left of (2) would, as we know, be a singular matrix, whereas 
in actual fact the matrix E in the right member of this equation 
is nonsingular since its determinant is equal to unity. Thus a singular 
matrix cannot have a right inverse matrix. Similar reasoning shows 
that it cannot have a left inverse matrix either, and for this 
reason, a singular matrix has no inverse at all. 

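Passing to the case of a nonsingular matrix, let us first introduce 
the following auxiliary concept. Suppose we have an nth-order 
matrix 

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\dots & \dots & \dots & \dots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{pmatrix}
\]

The matrix 

\[
A^* = \begin{pmatrix}
A_{11} & A_{21} & \dots & A_{n1} \\
A_{12} & A_{22} & \dots & A_{n2} \\
\dots & \dots & \dots & \dots \\
A_{1n} & A_{2n} & \dots & A_{nn}
\end{pmatrix}
\]

which consists of the cofactors of the elements of A (note that the 
cofactor of element $a_{ij}$ lies at the intersection of the jth row and 
the ith column) is called the adjoint of matrix A. 

Let us find the products $AA^*$ and $A^*A$. Using the familiar 
formula (see Sec. 6) for the expansion of a determinant about a row 
or column, and also the theorem (see Sec. 7) on the sum of the 
products of the elements of any row (column) of a determinant by the 
cofactors of the corresponding elements of another row (column) 
and denoting by d the determinant of the matrix A, 

\[
d = |A|,
\]

we get the following equations: 

\[
AA^* = A^*A = \begin{pmatrix}
d & 0 & \dots & 0 \\
0 & d & \dots & 0 \\
\dots & \dots & \dots & \dots \\
0 & 0 & \dots & d
\end{pmatrix}
\tag{3}
\]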
From this it follows that if matrix A is nonsingular, then its 
adjoint $A^*$ will also be nonsingular; note that the determinant $d^*$ 
of matrix $A^*$ is equal to the (n - 1)th power of the determinant d of 
matrix A. 

Indeed, passing from (3) to the equality between the 
determinants, we get 

\[
dd^* = d^n,
\]

whence, because $d \neq 0$, 

\[
d^* = d^{n-1}.
\]

(We could prove that if matrix A is singular, then its adjoint 
$A^*$ is also singular and has rank which does not exceed 1.) 
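
Equations (3) and the relation between d* and d can be verified on a sample matrix. The following sketch is an illustration with helper names of our own; the matrix chosen is a third-order one with determinant 5.

```python
def det(m):
    """Determinant by cofactor expansion about the first row (small orders only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def adjoint(A):
    """Adjoint matrix: the cofactor of a_ij stands in row j, column i."""
    n = len(A)
    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]
        return (-1) ** (i + j) * det(minor)
    return [[cofactor(j, i) for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product AB by formula (3) of Sec. 13."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

A = [[3, -1, 0], [-2, 1, 1], [2, -1, 4]]  # sample nonsingular matrix
d = det(A)
A_star = adjoint(A)
left = matmul(A, A_star)    # should be d times the identity matrix
right = matmul(A_star, A)   # likewise
d_star = det(A_star)        # should equal d ** (n - 1)
print(d, A_star, left, right, d_star)
```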

It is now easy to prove the existence of an inverse matrix for 
any nonsingular matrix A and to find its form. Note first that if 
we consider the product of two matrices AB and if we divide all 
the elements of one of the factors, say B, by one and the same 
number d, then all elements of the product AB are also divided 
by this number: to prove this all we need to do is recall the definition 
of matrix multiplication. Thus, if 

\[
d = |A| \neq 0,
\]

then from (3) it follows that the inverse of A is a matrix obtained from 
the adjoint $A^*$ by means of division of all its elements by the number d: 

\[
A^{-1} = \begin{pmatrix}
\dfrac{A_{11}}{d} & \dfrac{A_{21}}{d} & \dots & \dfrac{A_{n1}}{d} \\
\dfrac{A_{12}}{d} & \dfrac{A_{22}}{d} & \dots & \dfrac{A_{n2}}{d} \\
\dots & \dots & \dots & \dots \\
\dfrac{A_{1n}}{d} & \dfrac{A_{2n}}{d} & \dots & \dfrac{A_{nn}}{d}
\end{pmatrix}
\]

Indeed, from (3) follow the equalities 

\[
AA^{-1} = A^{-1}A = E. \tag{4}
\]


We stress once again that the ith row of matrix $A^{-1}$ contains 
the cofactors of the elements of the ith column of determinant |A| 
divided by d = |A|. 

It is easy to prove that matrix $A^{-1}$ is the only matrix which 
satisfies condition (4) for a given nonsingular matrix A. Indeed, 
if matrix C is such that 

\[
AC = CA = E,
\]

then 

\[
CAA^{-1} = C(AA^{-1}) = CE = C,
\]
\[
CAA^{-1} = (CA)A^{-1} = EA^{-1} = A^{-1},
\]

whence $C = A^{-1}$. 

From (4) and the multiplication theorem for determinants it 
follows that the determinant of matrix $A^{-1}$ is equal to $\frac{1}{d}$, so that 
this matrix is also nonsingular; its inverse is the matrix A. 

Now, if we have square matrices A and B of order n, of which A 
is nonsingular and B is arbitrary, then we can perform the right 
and left divisions of B by A, that is, we can solve the matrix equations 

\[
AX = B, \qquad YA = B. \tag{5}
\]

To do this, it will suffice (because of the associativity of matrix 
multiplication) to set 

\[
X = A^{-1}B, \qquad Y = BA^{-1}.
\]

These solutions of equations (5) will, in the general case (because 
matrix multiplication is not commutative), be distinct. 

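Example 1. Given a matrix 

\[
A = \begin{pmatrix} 3 & -1 & 0 \\ -2 & 1 & 1 \\ 2 & -1 & 4 \end{pmatrix}
\]

Its determinant |A| = 5, and so the inverse matrix exists: 

\[
A^{-1} = \begin{pmatrix}
1 & \tfrac{4}{5} & -\tfrac{1}{5} \\
2 & \tfrac{12}{5} & -\tfrac{3}{5} \\
0 & \tfrac{1}{5} & \tfrac{1}{5}
\end{pmatrix}
\]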



Example 2. Given the matrices 



The matrix A is nonsingular, and 



Therefore the following matrices are solutions of the equations AX = B, 
YA = B: 


Multiplication of rectangular matrices. Although in the preceding 
section matrix multiplication was only defined for square matrices 
of the same order, it carries over to the case of rectangular matrices A 
and B, provided it is possible to apply formula (3) of the preceding 
section, i.e., if any row of matrix A contains as many elements 
as there are in any column of matrix B. In other words, one can 
speak of the product of rectangular matrices A and B if the number 
of columns of matrix A is equal to the number of rows of matrix B, 
the number of rows of matrix AB being equal to the number of rows 
of A, and the number of columns of matrix AB to the number of columns 
of B. 

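Examples. 

(1)
\[
\begin{pmatrix} 5 & -1 & 3 & 1 \\ 2 & 0 & -1 & 4 \end{pmatrix}
\begin{pmatrix} -1 & 3 & 0 \\ -2 & 1 & 1 \\ 3 & 0 & -2 \\ 4 & 1 & 2 \end{pmatrix}
= \begin{pmatrix} 10 & 15 & -5 \\ 11 & 10 & 10 \end{pmatrix}
\]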

( 2 ) 

(3) 


Multiplication of rectangular matrices may be related to a 
successive performance of linear transformations of the unknowns, 
provided only that in the definition of the latter we give up the 
assumption that the number of unknowns is preserved under the 
linear transformation. 

It is also easy to verify, by repeating word for word the proof 
given above for the case of square matrices, that associativity holds 
true also for the multiplication of rectangular matrices. 

We now take advantage of the multiplication of rectangular 
matrices and of properties of the inverse matrix for a new deriva- 
tion of Cramer’s rule, which does not require the involved compu- 
tations that were carried out in Sec. 7. Let there be given a system 
of n linear equations in n unknowns: 



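$$\left.\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1, \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n &= b_2, \\
\ldots \quad \ldots \quad \ldots \quad & \\
a_{n1}x_1 + a_{n2}x_2 + \ldots + a_{nn}x_n &= b_n
\end{aligned}\right\} \qquad (6)$$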



The determinant d of this system is assumed to be different from zero.
Denote by A the coefficient matrix of system (6); this matrix is nonsingular
since, by hypothesis, $d = |A| \neq 0$. Denote by X the column of
unknowns, by B the column of constant terms of (6); thus

$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad
B = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}$$


The product AX is meaningful since the number of columns of 
matrix A is equal to the number of rows of matrix X; this product 
will be a column composed of the left-hand sides of the equations 
of system (6). Thus, (6) may be written in the form of a single matrix 
equation 

AX=B (7) 

Multiplying both sides of (7) on the left by the matrix $A^{-1}$, the
existence of which follows from the nonsingular nature of the square
matrix A, we get

$$X = A^{-1}B \qquad (8)$$


The product on the right is a matrix of one column; its jth
element is equal to the sum of the products of the elements of the
jth row of matrix $A^{-1}$ by the corresponding elements of matrix B,
that is, it is equal to the number

$$\frac{A_{1j}}{d}\, b_1 + \frac{A_{2j}}{d}\, b_2 + \ldots + \frac{A_{nj}}{d}\, b_n
= \frac{1}{d}\,(A_{1j} b_1 + A_{2j} b_2 + \ldots + A_{nj} b_n)$$

where $A_{ij}$ denotes the cofactor of the element $a_{ij}$ in d.
The parenthesis on the right is, however, an expansion about the
jth column of determinant $d_j$, which is obtained by replacing the
jth column of d by the column B. Thus, formulas (8) are equivalent
to formulas (3), Sec. 7, which express the solution obtained by
Cramer’s rule to system (6).

It remains to show that the values of the unknowns thus obtained 
are indeed the solution of system (6). To do this, put expression (8) 
into the matrix equation (7); it obviously yields the identity B = B. 
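
As a numerical illustration of the derivation above, here is a minimal sketch that computes each unknown as $d_j / d$ and then substitutes the result back into $AX = B$. The function names and the sample right-hand side are assumptions made for the example; exact rational arithmetic from the standard library avoids rounding.

```python
from fractions import Fraction

def det(M):
    """Determinant by expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def cramer(A, b):
    """Unknown x_j = d_j / d, where d_j is d with column j replaced by b."""
    d = det(A)
    if d == 0:
        raise ValueError("the coefficient matrix is singular")
    return [Fraction(det([row[:j] + [b[i]] + row[j + 1:]
                          for i, row in enumerate(A)]), d)
            for j in range(len(A))]

A = [[3, -1, 0], [-2, 1, 1], [2, -1, 4]]   # determinant 5
b = [1, 2, 3]                              # hypothetical constant terms
x = cramer(A, b)
# Substitute back: each left-hand side of the system should reproduce b.
residual = [sum(a * xi for a, xi in zip(row, x)) for row in A]
print(x, residual == b)
```

Substituting the computed column back into the system reproduces the column of constant terms, which is the identity $B = B$ mentioned above.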

The rank of a product of matrices. In the case of singular matrices,
the multiplication theorem for determinants tells us nothing
beyond the fact that their product will also be singular,
although singular square matrices can be distinguished according
to rank as well. Note that there is no completely definite relationship
between the ranks of the factors and the rank of the product,
as is evident from a glance at the following examples:


In both cases, matrices of rank 1 are multiplied, but in one case
the product has rank 1, in the other, rank 0. It is only the following
theorem which holds true (and not only for square but for rectangular
matrices as well).

The rank of a product of matrices does not exceed the rank of each 
of the factors. 

It will suffice to prove this theorem for the case of two factors.
Suppose we have matrices A and B for which the product AB is
meaningful: AB = C. We consider formula (3), Sec. 13, which
yields an expression for the elements of matrix C. Taking this
formula for the given k and for all possible i (i = 1, 2, ...),
we find that the kth column of matrix C is the sum of all the columns
of matrix A taken with certain coefficients (namely, with the
coefficients $b_{1k}, b_{2k}, \ldots$). This is proof that the system of columns of
matrix C is expressed linearly in terms of the system of columns
of matrix A and, therefore, as shown in Sec. 9, the rank of the first
system is less than or equal to the rank of the second system; in
other words, the rank of matrix C does not exceed the rank of matrix
A. On the other hand, since from this same formula (3), Sec. 13,
there follows, for a given i and for all k, that each ith row of matrix
C is a linear combination of the rows of matrix B, we find by analogous
reasoning that the rank of C is not greater than the rank of B.

A more precise result is obtained in the case when one of the 
factors is a nonsingular square matrix. 

The rank of the product obtained by pre- or postmultiplication
of an arbitrary matrix A by a nonsingular square matrix Q is equal
to the rank of matrix A.

For example, suppose

$$AQ = C \qquad (9)$$

From the preceding theorem it follows that the rank of matrix C
is not greater than the rank of matrix A. However, multiplying (9)
on the right by $Q^{-1}$, we arrive at the equation

$$A = CQ^{-1}$$

and for this reason, again on the basis of the preceding theorem,
the rank of A does not exceed that of C. A comparison of these two
results proves the coincidence of the ranks of matrices A and C.
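
Both rank theorems can be checked on small cases. The sketch below is only an illustration: `rank` is a hypothetical helper that performs Gaussian elimination over the rationals, and the matrices P, R, Q, A are sample data chosen for the example.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) via Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in M]
    r = 0                                   # number of pivots found so far
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

P = [[1, 0], [0, 0]]            # rank 1
R = [[0, 0], [0, 1]]            # rank 1
Q = [[1, 2], [3, 4]]            # nonsingular (determinant -2)
A = [[1, 2], [2, 4], [3, 6]]    # 3 x 2, rank 1

print(rank(matmul(P, P)), rank(matmul(P, R)))   # rank 1 and rank 0
print(rank(matmul(A, Q)) == rank(A))            # unchanged by nonsingular Q
```

Two factors of rank 1 give a product of rank 1 in one case and rank 0 in the other, while multiplication by the nonsingular matrix Q leaves the rank of A unchanged.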


15. Matrix Addition and Multiplication 
of a Matrix by a Scalar 

For square matrices of order n, addition is defined as follows.
The sum A + B of two square matrices $A = (a_{ij})$ and $B = (b_{ij})$
of order n is the matrix $C = (c_{ij})$, each element of which is equal
to the sum of the corresponding elements of matrices A and B:

$$c_{ij} = a_{ij} + b_{ij}$$ *

The addition of matrices thus defined will obviously be
commutative and associative. The inverse operation also exists;
the difference between the matrices A and B is a matrix composed
of the differences of the corresponding elements of the given matrices.
The role of zero is played by the zero matrix, composed entirely
of zeros; this matrix will from now on be denoted by the symbol 0.
There is no real danger of confusing a zero matrix and the number
zero.

The addition of square matrices and their multiplication as defined 
in Sec. 13 are related by the distributive laws. 

Indeed, suppose we have three matrices of order n, $A = (a_{ij})$,
$B = (b_{ij})$, $C = (c_{ij})$. Then for any i and j we have the obvious
equality

$$\sum_{s=1}^{n} (a_{is} + b_{is})\, c_{sj} = \sum_{s=1}^{n} a_{is} c_{sj} + \sum_{s=1}^{n} b_{is} c_{sj}$$

However, the left side of this equation is an element in the ith row
and jth column of the matrix (A + B)C, the right side is an element
in the same position in the matrix AC + BC. This proves the
equation

$$(A + B)C = AC + BC$$

The equation $C(A + B) = CA + CB$ is proved in exactly the
same way: the noncommutativity of matrix multiplication quite
naturally requires proof of both of these distributive laws.

Let us introduce the following definition of multiplication of 
matrices by a scalar. 

The product kA of a square matrix $A = (a_{ij})$ by a scalar k is the
matrix $A' = (a'_{ij})$ obtained by multiplying all elements of the
matrix A by k:

$$a'_{ij} = k a_{ij}$$

We have already encountered (Sec. 14) one such example of
multiplication of a matrix by a scalar: if matrix A is nonsingular,
and $|A| = d$, then its inverse $A^{-1}$ and the adjoint $A^*$ are connected
by the equation

$$A^{-1} = d^{-1} A^*$$

As we know, any square matrix of order n may be regarded
as an $n^2$-dimensional vector; this correspondence between matrices
and vectors is one-to-one. The addition of matrices and the
multiplication of a matrix by a scalar defined here are then converted
into the addition of vectors and the multiplication of a vector
by a scalar. Thus, the collection of square matrices of order n may be
regarded as an $n^2$-dimensional vector space.

* Of course, one could define the matrix product in just as natural a way
by multiplying the corresponding elements. However, such multiplication,
unlike that defined in Sec. 13, would not find any serious applications.

From this follows the truth of the following equations (here,
A and B are matrices of order n; k, l are scalars and 1 is the number
unity):

$$k(A + B) = kA + kB, \qquad (1)$$

$$(k + l)A = kA + lA, \qquad (2)$$

$$k(lA) = (kl)A, \qquad (3)$$

$$1 \cdot A = A \qquad (4)$$

Properties (1) and (2) connect multiplication of a matrix by
a scalar with addition of matrices. At the same time, there is a very
important relationship between the multiplication of a matrix by
a scalar and multiplication of the matrices alone, namely,

$$(kA)B = A(kB) = k(AB) \qquad (5)$$

In words, if one of the factors in a product of matrices is multiplied
by a scalar k, then the whole product is multiplied by k.

Let there be matrices $A = (a_{ij})$ and $B = (b_{ij})$ and a scalar k.
Then for any i and j,

$$\sum_{s=1}^{n} (k a_{is})\, b_{sj} = k \sum_{s=1}^{n} a_{is} b_{sj}$$

The left side of this equation, however, is an element in the ith
row and the jth column of matrix (kA)B, the right side is an element
in the same place in matrix k(AB). This proves the equation

$$(kA)B = k(AB)$$

The equation $A(kB) = k(AB)$ is proved in the same way.
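
The relations between the three matrix operations can be spot-checked on concrete integer matrices. This is a small sketch under the assumption that matrices are lists of rows; the helpers `matmul`, `add`, `scale` and the sample values are illustrative only, and a check on one example is of course not a proof.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def add(A, B):
    """Elementwise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(k, A):
    """Product of the matrix A by the scalar k."""
    return [[k * a for a in row] for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 5]]
C = [[2, 0], [1, 3]]
k = 7

# (kA)B = A(kB) = k(AB)
scalar_ok = matmul(scale(k, A), B) == matmul(A, scale(k, B)) == scale(k, matmul(A, B))
# (A + B)C = AC + BC and C(A + B) = CA + CB
right_ok = matmul(add(A, B), C) == add(matmul(A, C), matmul(B, C))
left_ok = matmul(C, add(A, B)) == add(matmul(C, A), matmul(C, B))
print(scalar_ok, right_ok, left_ok)
```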

The operation of multiplication of a matrix by a scalar permits
introducing a new mode of matrix notation. Denote by $E_{ij}$ the
matrix in which unity lies at the intersection of the ith row and
the jth column, all other elements being zero. Setting i = 1, 2, ..., n,
and j = 1, 2, ..., n, we obtain $n^2$ such matrices $E_{ij}$,
which are connected, as may easily be verified, by the following
multiplication table:

$$E_{is}E_{sj} = E_{ij}, \qquad E_{is}E_{tj} = 0 \ \text{ for } s \neq t$$

The matrix $kE_{ij}$ differs from the matrix $E_{ij}$ solely in the fact
that it has the scalar k at the intersection of the ith row and the
jth column. Taking this into consideration and using the definition
of matrix addition, we get the following notation for an arbitrary
square matrix A:

$$A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{pmatrix}
= \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} E_{ij} \qquad (6)$$

The representation of the matrix A in the form (6) is obviously unique.
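
The multiplication table for the matrices $E_{ij}$ and the decomposition of a matrix into their multiples can be verified exhaustively for a small order. The following is a minimal sketch for n = 3; the helper names and the sample matrix are assumptions made for the illustration.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(k, A):
    return [[k * a for a in row] for row in A]

def unit(i, j, n):
    """E_ij of order n (1-based indices): 1 at position (i, j), 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(1, n + 1)]
            for r in range(1, n + 1)]

n = 3
idx = range(1, n + 1)
zero = [[0] * n for _ in range(n)]

# E_is E_tj equals E_ij when s == t and the zero matrix otherwise.
table_ok = all(
    matmul(unit(i, s, n), unit(t, j, n)) == (unit(i, j, n) if s == t else zero)
    for i in idx for s in idx for t in idx for j in idx)

# Rebuild a sample matrix as the sum of a_ij * E_ij.
A = [[3, -1, 0], [-2, 1, 1], [2, -1, 4]]
total = zero
for i in idx:
    for j in idx:
        total = add(total, scale(A[i - 1][j - 1], unit(i, j, n)))
print(table_ok, total == A)
```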

The matrix kE, where E is the unit matrix, has, by the definition
of multiplication of a matrix by a scalar, the following form:

$$kE = \begin{pmatrix} k & & & 0 \\ & k & & \\ & & \ddots & \\ 0 & & & k \end{pmatrix}$$

that is to say, one and the same scalar k on the principal diagonal
and all other elements zero. Such matrices are called scalar matrices.

The definition of matrix addition leads to the equation

$$kE + lE = (k + l)E \qquad (7)$$

On the other hand, using the definition of matrix multiplication
or proceeding from (5), we get

$$kE \cdot lE = (kl)E \qquad (8)$$

Multiplication of matrix A by a scalar k may be interpreted as
multiplication of A by a scalar matrix kE in the meaning of
multiplication of matrices. Indeed, by (5),

$$(kE)A = A(kE) = kA$$

The conclusion to be drawn here is that every scalar matrix 
commutes with any matrix A. It is very important to point out that 
scalar matrices are the only ones with this property. 

If a matrix $C = (c_{ij})$ of order n commutes with any matrix of the
same order, then C is a scalar matrix.

Indeed, set $i \neq j$ and consider the products $CE_{ij}$ and $E_{ij}C$
(which by hypothesis are equal; see above definition of the matrix
$E_{ij}$). It is clear that all columns of matrix $CE_{ij}$, except the jth,
consist of zeros, and the jth column coincides with the ith column
of matrix C; in particular, element $c_{ii}$ lies at the intersection of
the ith row and the jth column of matrix $CE_{ij}$. Similarly, all the
rows of matrix $E_{ij}C$, except the ith, consist of zeros, and the ith
row coincides with the jth row of matrix C; at the intersection of
the ith row and the jth column of matrix $E_{ij}C$ lies the element $c_{jj}$.
Using the equality $CE_{ij} = E_{ij}C$, we find that $c_{ii} = c_{jj}$ (as elements
in the same positions of equal matrices), which is to say that all
elements of the principal diagonal of matrix C are equal. On the
other hand, element $c_{ji}$ lies at the intersection of the jth row and
the jth column of matrix $CE_{ij}$, but in matrix $E_{ij}C$ we have a zero
at this site (because $i \neq j$), and therefore $c_{ji} = 0$, or every
off-diagonal element of matrix C is zero. The theorem is proved.
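
The proof only uses commutation with the matrices $E_{ij}$, which suggests a simple test. The sketch below is an illustration under assumed helper names: it checks whether a given matrix commutes with every $E_{ij}$ of the same order and applies the check to one scalar and two non-scalar matrices.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def unit(i, j, n):
    """E_ij of order n (1-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(1, n + 1)]
            for r in range(1, n + 1)]

def commutes_with_all_units(C):
    """True if C E_ij == E_ij C for every pair (i, j)."""
    n = len(C)
    return all(matmul(C, unit(i, j, n)) == matmul(unit(i, j, n), C)
               for i in range(1, n + 1) for j in range(1, n + 1))

scalar = [[4, 0], [0, 4]]       # scalar matrix 4E
diagonal = [[1, 0], [0, 2]]     # diagonal but not scalar
shear = [[1, 1], [0, 1]]        # equal diagonal, nonzero off-diagonal element

print(commutes_with_all_units(scalar),
      commutes_with_all_units(diagonal),
      commutes_with_all_units(shear))
```

Only the scalar matrix passes; the diagonal matrix fails because its diagonal elements differ, and the third matrix fails because of its nonzero off-diagonal element, matching the two steps of the proof.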


16. An Axiomatic Construction
of the Theory of Determinants

An nth-order determinant is a number which is uniquely defined 
by a given square matrix of order n. The definition of this concept 
given in Sec. 4 points to a rule by which a determinant can be 
expressed in terms of the elements of the given matrix. This construc- 
tive definition may, however, be replaced by an axiomatic definition. 
In other words, it is possible to point out, among the properties
of a determinant that were established in Secs. 4 and 6, such
properties that the determinant is the sole function of a real matrix having
these properties.

The simplest definition of this kind consists in utilizing the
expansion of a determinant in terms of a row. Let us consider square
matrices of any order and let us assume that any such matrix M
is associated with a number $d_M$, and the following conditions hold:

(1) If the matrix M is of order one, that is, if it consists of
a single element a, then $d_M = a$.

(2) If the first row of a matrix M of order n is made up of the
elements $a_{11}, a_{12}, \ldots, a_{1n}$ and if $M_i$ denotes
the matrix of order n - 1 which remains after deleting from M the
first row and the ith column, then

$$d_M = a_{11} d_{M_1} - a_{12} d_{M_2} + a_{13} d_{M_3} - \ldots + (-1)^{n-1} a_{1n} d_{M_n}$$

Then for any matrix M, the number $d_M$ is equal to the determinant
of that matrix. We leave it to the reader to carry out the proof of
this assertion, which is done by induction with respect to n and
utilizes the results of Sec. 6.
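
Conditions (1) and (2) translate directly into a recursive procedure. The following is a minimal sketch that compares it with the sum over permutations (the constructive definition); the function names and test matrices are illustrative assumptions, and agreement on examples does not replace the inductive proof.

```python
from itertools import permutations

def det_expansion(M):
    """The number d_M defined by conditions (1) and (2)."""
    if len(M) == 1:
        return M[0][0]                                    # condition (1)
    # Condition (2): alternate signs along the first row; the minor
    # deletes the first row and column j (0-based here).
    return sum((-1) ** j * M[0][j]
               * det_expansion([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def det_permutations(M):
    """Signed sum of products, one element from each row and column."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[a] > p[b] for a in range(n) for b in range(a + 1, n))
        term = (-1) ** inversions
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

M3 = [[3, -1, 0], [-2, 1, 1], [2, -1, 4]]
M4 = [[2, 0, 1, 3], [1, 1, 0, -1], [0, 4, 2, 1], [3, -2, 1, 0]]
print(det_expansion(M3), det_permutations(M3))
print(det_expansion(M4) == det_permutations(M4))
```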

Much more interesting are some other forms of an axiomatic
definition of a determinant which refer solely to the case of a given
order n and have for a basis some of the simplest determinant
properties that were established in Sec. 4. Let us examine one of these
definitions.

Let any square matrix M of order n be associated with a number
$d_M$, and let the following conditions hold true.

I. If one of the rows of matrix M is multiplied by a number k, then
the number $d_M$ is also multiplied by k.

II. The number $d_M$ is not changed if to one of the rows of M we
add another row of this matrix.

III. If E is the unit matrix, then $d_E = 1$.

We shall prove that for any matrix M the number $d_M$ is equal
to the determinant of the matrix.

Let us first derive from the conditions I to III certain properties
of the number $d_M$ that are analogous to the corresponding properties
of a determinant.

(1) If one of the rows of matrix M consists of zeros, then $d_M = 0$.

Indeed, by multiplying a row consisting of zeros by the number 0,
we do not change the matrix, but because of Condition I, the number
$d_M$ acquires the factor 0. Therefore

$$d_M = 0 \cdot d_M = 0$$

(2) The number $d_M$ does not change if to the ith row of matrix M
we add its jth row, $j \neq i$, multiplied by a scalar k.

If k = 0, there is nothing to prove. If $k \neq 0$, then we multiply
the jth row by k and obtain a matrix M' for which, because of
Condition I, $d_{M'} = k d_M$. Then to the ith row of matrix M' we
add the jth row and obtain the matrix M'' and, because of
Condition II, $d_{M''} = d_{M'}$. Finally, we multiply the jth row of matrix M''
by the scalar $k^{-1}$. We arrive at matrix M''', which is actually
obtained from M by the transformation indicated in the formulation
of the property being proved; note that

$$d_{M'''} = k^{-1} d_{M''} = k^{-1} d_{M'} = k^{-1} \cdot k\, d_M = d_M$$

(3) If the rows of matrix M are linearly dependent, then $d_M = 0$.

Indeed, if one of the rows, say the ith, is a linear combination
of the other rows, then, applying transformation (2) several times,
it is possible to replace the ith row by a row of zeros. Transformation
(2) does not change the number $d_M$ and so, by Property (1),
$d_M = 0$.

(4) If the ith row of matrix M is a sum of two vectors $\beta$ and $\gamma$ and
if matrices M' and M'' are obtained from M by replacing its ith row
by the vectors $\beta$ and $\gamma$, respectively, then

$$d_M = d_{M'} + d_{M''}$$

Let S be the system of all rows of matrix M, except the ith.
If there is a linear dependence in S, then the rows of each one of
the matrices M, M', M'' are linearly dependent, and therefore,
by Property (3), $d_M = d_{M'} = d_{M''} = 0$, whence in that case
follows the truth of the property being proved. Now if the system S,
consisting of n - 1 vectors, is linearly independent, then, as the
results of Sec. 9 show, a vector $\alpha$ may be adjoined to form a maximal
linearly independent system of vectors of n-dimensional vector
space. It is possible to express the vectors $\beta$ and $\gamma$ linearly in terms
of this system. Let vector $\alpha$ enter these expressions with the
coefficients k and l, respectively; vector $\alpha$ will then enter the expression
for the vector $\beta + \gamma$, that is, for the ith row of matrix M, with
the coefficient k + l. Matrices M, M' and M'' can now be transformed
by subtracting from their ith rows certain linear combinations of
other rows so that the vectors $(k + l)\alpha$, $k\alpha$ and $l\alpha$ will serve
respectively as their ith rows. Therefore, denoting by $M^0$ the matrix
obtained from matrix M by replacing its ith row by the vector $\alpha$
and taking into account Property (2) and Condition I, we arrive at the
equations

$$d_M = (k + l)\, d_{M^0}, \qquad d_{M'} = k\, d_{M^0}, \qquad d_{M''} = l\, d_{M^0}$$

The proof of Property (4) is complete.

(5) If matrix $\bar{M}$ is obtained from matrix M by interchanging
two rows, then $d_{\bar{M}} = -d_M$.

Suppose it is necessary in matrix M to interchange the rows
with subscripts i and j. This can be achieved by a chain of
transformations: first add to the ith row of M its jth row and get matrix
M'; by Condition II, $d_{M'} = d_M$. Then from the jth row of M'
subtract its ith row and arrive at the matrix M'', for which, by Property
(2), we have $d_{M''} = d_{M'}$; the jth row of M'' will differ in sign from
the ith row of M. Now add to the ith row of M'' its jth row. For
matrix M''', which this manipulation yields, we have, by
Condition II, $d_{M'''} = d_{M''}$, and the ith row of this matrix coincides with
the jth row of matrix M. Finally, multiplying the jth row of M'''
by -1, we arrive at the desired matrix $\bar{M}$. Therefore, by Condition I,

$$d_{\bar{M}} = -d_{M'''} = -d_M$$

(6) If matrix M' is obtained from matrix M by interchanging rows,
the $\alpha_i$th row of matrix M serving as the ith row of matrix M',
i = 1, 2, ..., n, then

$$d_{M'} = \pm d_M$$

The plus sign corresponds to the case when the permutation

$$\begin{pmatrix} 1 & 2 & \ldots & n \\ \alpha_1 & \alpha_2 & \ldots & \alpha_n \end{pmatrix}$$

is even; the minus sign, to the case when it is odd.

Indeed, matrix M' may be obtained from matrix M by a number
of transpositions of two rows, and for this reason we can take
advantage of Property (5). The parity of the number of these
transpositions determines, as we know from Sec. 3, the parity of the
above-given permutation.

Now let us consider the matrices $M = (a_{ij})$, $N = (b_{ij})$ and
their product Q = MN in the meaning of Sec. 13. We find the
number $d_Q$. We know that any ith row of matrix Q is the sum of
all the rows of matrix N taken, respectively, with the coefficients
$a_{i1}, a_{i2}, \ldots, a_{in}$ (see, for example, Sec. 14). Replace all the rows
of Q by their indicated linear expressions in terms of the rows of
matrix N and take advantage of Property (4) several times. We
find that the number $d_Q$ will equal the sum of the numbers $d_T$ for
all possible matrices T of the following kind: the ith row of T,
i = 1, 2, ..., n, is equal to the $\alpha_i$th row of matrix N multiplied
by a scalar $a_{i\alpha_i}$. Here, because of Property (3), we can disregard
all matrices T for which there exist subscripts i and j, $i \neq j$, such
that $\alpha_i = \alpha_j$; in other words, what remain are only matrices T
for which the subscripts $\alpha_1, \alpha_2, \ldots, \alpha_n$ constitute an arrangement
of the numbers 1, 2, ..., n. Because of Condition I and Property (6), the
number $d_T$ for such a matrix is of the form

$$d_T = \pm a_{1\alpha_1} a_{2\alpha_2} \ldots a_{n\alpha_n}\, d_N$$

where the sign is determined by the parity of the permutation formed
from the subscripts. Whence we arrive at the expression for the
number $d_Q$: after factoring the common factor $d_N$ out of all summands
of the type $d_T$, what we obviously have left in the parentheses is the
determinant |M| of the matrix M in the sense of the constructive
definition as given in Sec. 4, i.e.,

$$d_Q = |M| \cdot d_N \qquad (*)$$

If we now take the unit matrix E for the matrix N, then Q = M,
and, by Condition III, $d_N = d_E = 1$, that is, for any matrix M we
have the equality

$$d_M = |M|$$

which is what we set out to prove. At the same time, once again,
and without the use of the Laplace theorem, we have proved the
multiplication theorem for determinants: all that needs to be done
is, in equation (*), to replace the numbers $d_Q$ and $d_N$ by the
determinants of the respective matrices.
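
The argument shows that Conditions I to III pin down the number $d_M$ completely, so it can be computed using row operations alone. The sketch below is one possible way to do this, not the method of the text: it reduces M to the unit matrix, tracking the factor contributed by each step, and then spot-checks the multiplication theorem. The function name `d` and the sample matrices are assumptions for the illustration.

```python
from fractions import Fraction

def d(M):
    """The number d_M, computed only with the row operations of this section."""
    rows = [[Fraction(x) for x in r] for r in M]
    n, value = len(rows), Fraction(1)
    for c in range(n):
        pivot = next((i for i in range(c, n) if rows[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)            # rows are dependent: Property (3)
        if pivot != c:
            rows[c], rows[pivot] = rows[pivot], rows[c]
            value = -value                # row interchange: Property (5)
        value *= rows[c][c]               # Condition I: factor out the pivot
        rows[c] = [x / rows[c][c] for x in rows[c]]
        for i in range(n):
            if i != c:                    # Property (2): d_M is unchanged
                f = rows[i][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[c])]
    return value                          # what remains is E: Condition III

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

M = [[3, -1, 0], [-2, 1, 1], [2, -1, 4]]
N = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
print(d(M), d(N), d(matmul(M, N)) == d(M) * d(N))
```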

We conclude these axiomatic considerations with proof of the 
independence of Conditions I to III, that is proof that none of 
these conditions is a consequence of the other two. 

To prove the independence of Condition III, assume that $d_M = 0$
for any matrix M of order n. Conditions I and II will obviously
be fulfilled, but III breaks down.

To prove the independence of Condition II, assume that for
any matrix M the number $d_M$ is equal to the product of the elements
in the principal diagonal of the matrix. Conditions I and III are
fulfilled, but Condition II breaks down.

Finally, to prove the independence of Condition I, assume that
$d_M = 1$ for any matrix M. Conditions II and III will be fulfilled
but Condition I fails.
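
The three counterexamples can be made concrete. The following sketch only exhibits, on one hypothetical 2 x 2 matrix, the failure that each candidate function is claimed to have; the function and helper names are illustrative, and the code does not verify that the remaining conditions hold in general.

```python
def zero_fn(M):
    return 0                      # candidate violating Condition III

def one_fn(M):
    return 1                      # candidate violating Condition I

def diag_product(M):
    """Product of the principal diagonal: candidate violating Condition II."""
    p = 1
    for i in range(len(M)):
        p *= M[i][i]
    return p

def scale_row(M, i, k):
    """Multiply row i (0-based) of M by k."""
    return [[k * x for x in r] if idx == i else list(r)
            for idx, r in enumerate(M)]

def add_row(M, i, j):
    """Add row j of M to row i (0-based)."""
    rows = [list(r) for r in M]
    rows[i] = [x + y for x, y in zip(M[i], M[j])]
    return rows

E = [[1, 0], [0, 1]]
M = [[1, 1], [0, 1]]

fails_III = zero_fn(E) != 1
fails_II = diag_product(add_row(M, 1, 0)) != diag_product(M)
fails_I = one_fn(scale_row(M, 0, 3)) != 3 * one_fn(M)
print(fails_III, fails_II, fails_I)
```

For the diagonal product, adding the first row of M to the second changes the value from 1 to 2, while scaling a row still multiplies the value by the same factor and the unit matrix still gives 1.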
17. The System of Complex Numbers

During the course of elementary algebra the range of numbers 
is expanded several times. The beginning student of algebra brings 
with him from arithmetic a knowledge of positive integers and 
fractions. Algebra actually begins with the introduction of negative 
numbers, thus establishing the first of the important number 
systems, the system of integers, which consists of all the positive 
and all the negative integers and zero, and the broader system of 
rational numbers consisting of all integers and all fractions (both 
positive and negative). 

A further extension of the number realm is the introduction 
of the irrational numbers. The system consisting of all rational and 
all irrational numbers is the system of real numbers. A university 
course of mathematical analysis usually contains a rigorous construc- 
tion of the system of real numbers; however, for our purposes in this 
course the knowledge of the real numbers that the reader has when 
he takes up the study of higher algebra will suffice. 

Finally, at the very end of the course of elementary algebra, 
the system of real numbers is extended to the system of complex 
numbers. Of course this system of numbers is less common than the 
system of real numbers, though actually it possesses many very 
good properties. In this chapter we recapitulate with sufficient 
completeness the theory of complex numbers. 

Complex numbers are introduced in connection with the following
problem. We know that the real numbers do not suffice for us to
solve every quadratic equation with real coefficients. The simplest
of the quadratics that does not have any roots in the class of real
numbers is

$$x^2 + 1 = 0 \qquad (1)$$

We will only be interested in this equation for the present. The
problem confronting us is: to extend the system of real numbers to
a system of numbers that will supply us with a root for equation (1).





CH. 4. COMPLEX NUMBERS 


As construction material for this new system of numbers, let 
us take advantage of points in a plane. It will be recalled that the 
depicting of real numbers by points of the straight line (this is 
based on the fact that we obtain a one-to-one correspondence between 
the set of all points of the line and the set of all real numbers if, for
a given origin of coordinates and a scale unit, every point of the 
line is associated with an abscissa) is systematically utilized in all 
divisions of mathematics and is so customary that ordinarily, 
we do not make any distinction between a real number and the 
point that depicts it. 

Thus, we wish to define a system of numbers correlated with all
points in the plane. Up till now we have not had to add or multiply 
points of a plane, and so we can define the operations involving 
points, taking care only that the new system of numbers should 
possess all the properties intended for it. These definitions, parti- 
cularly for products, will at first appear to be rather artificial. 
In Chapter 10, it will be shown however that no other definitions 
of operations, which at first glance may seem more natural, would 
give us what we want; that is, they would not result in the construc- 
tion of an extension of the system of real numbers containing the 
root of equation (1). It will also be demonstrated there that replacing 
the points of a plane by any other material would not have led 
to a system of numbers whose algebraic properties differ from the 
system of complex numbers which we will construct below. 

We have a plane and we choose a rectangular system of coordinates.
Let us agree to denote points of the plane by the letters
$\alpha, \beta, \gamma, \ldots$ and to write a point $\alpha$ with abscissa a and ordinate b
as (a, b), that is, departing somewhat from what is accepted in
analytic geometry, to write $\alpha = (a, b)$. If we have points $\alpha = (a, b)$
and $\beta = (c, d)$, then the sum of these points will be a point with
abscissa a + c and ordinate b + d, or

$$(a, b) + (c, d) = (a + c,\; b + d) \qquad (2)$$

For the product of the points $\alpha = (a, b)$ and $\beta = (c, d)$ we will have
the point with abscissa ac - bd and with ordinate ad + bc, or

$$(a, b)(c, d) = (ac - bd,\; ad + bc) \qquad (3)$$

We have thus defined two algebraic operations on the set of 
all points in the plane. We will show that these operations have 
all the basic properties possessed by operations in the system of real 
numbers or in the system of rational numbers; both are commutative 
and associative, connected by the distributive law, and have inverse 
operations: subtraction and division (except by zero).

Commutativity and associativity of addition are obvious (more
precisely, they follow from the corresponding properties of the
addition of real numbers) since in the process of adding points of
the plane we separately add their abscissas and their ordinates.
The commutativity of multiplication is based on the fact that the
points $\alpha$ and $\beta$ enter the definition of a product symmetrically.
The following equations prove associativity of multiplication:

$$[(a, b)(c, d)](e, f) = (ac - bd,\; ad + bc)(e, f)
= (ace - bde - adf - bcf,\; acf - bdf + ade + bce),$$

$$(a, b)[(c, d)(e, f)] = (a, b)(ce - df,\; cf + de)
= (ace - adf - bcf - bde,\; acf + ade + bce - bdf)$$

The distributive law follows from the equations

$$[(a, b) + (c, d)](e, f) = (a + c,\; b + d)(e, f)
= (ae + ce - bf - df,\; af + cf + be + de),$$

$$(a, b)(e, f) + (c, d)(e, f) = (ae - bf,\; af + be) + (ce - df,\; cf + de)
= (ae - bf + ce - df,\; af + be + cf + de)$$

Let us examine the inverse operations. If we have the points
$\alpha = (a, b)$ and $\beta = (c, d)$, then their difference is a point (x, y)
such that

$$(c, d) + (x, y) = (a, b)$$

Whence, by (2),

$$c + x = a, \qquad d + y = b$$

Thus, the difference of the points $\alpha = (a, b)$ and $\beta = (c, d)$ is the
point

$$\alpha - \beta = (a - c,\; b - d) \qquad (4)$$

and this difference is defined in unique fashion. In particular, zero
is the coordinate origin (0, 0); the opposite point of $\alpha = (a, b)$
is the point

$$-\alpha = (-a,\; -b) \qquad (5)$$

Now, suppose we have the points $\alpha = (a, b)$ and $\beta = (c, d)$,
and suppose point $\beta$ is nonzero; that is, at least one of coordinates c,
d is nonzero, and therefore $c^2 + d^2 \neq 0$. The quotient of $\alpha$ divided
by $\beta$ must be a point (x, y) such that (c, d)(x, y) = (a, b). Whence,
by (3),

$$cx - dy = a, \qquad dx + cy = b$$

Solving this system of equations, we obtain

$$x = \frac{ac + bd}{c^2 + d^2}, \qquad y = \frac{bc - ad}{c^2 + d^2}$$

Thus, for $\beta \neq 0$ the quotient exists and is unambiguously defined:

$$\frac{\alpha}{\beta} = \left( \frac{ac + bd}{c^2 + d^2},\; \frac{bc - ad}{c^2 + d^2} \right) \qquad (6)$$

Assuming $\beta = \alpha$, we find that in our multiplication of points unity
is a point (1, 0) lying on the axis of abscissas at a distance 1 to the
right of the origin. Also assuming in (6) that $\alpha = 1 = (1, 0)$,
we find that for $\beta \neq 0$, the inverse of $\beta$ is

$$\beta^{-1} = \left( \frac{c}{c^2 + d^2},\; \frac{-d}{c^2 + d^2} \right)$$


We have thus constructed a system of numbers that can be depicted 
by points in the plane, and the operations on these numbers are 
defined by formulas (2) and (3). This system is called the system 
of complex numbers. 

Let us now show that the system of complex numbers is an extension
of the system of real numbers. To do this, we consider points lying
on the axis of abscissas, or points of the form (a, 0); associating
a real number a with the point (a, 0), we evidently get a one-to-one
correspondence between the set of points under consideration and
the set of all the real numbers. Applying to these points formulas
(2) and (3), we get

$$(a, 0) + (b, 0) = (a + b,\; 0),$$

$$(a, 0)(b, 0) = (ab,\; 0)$$

i.e., points (a, 0) may be added and multiplied in the same way
as the corresponding real numbers. Thus, the set of points on the
axis of abscissas, considered as a part of the system of complex numbers,
does not differ in its algebraic properties from the system of real numbers
as ordinarily depicted by points on a straight line. This will enable
us, in the future, to equate the point (a, 0) and the real number a,
i.e., we will always assume (a, 0) = a. In particular, zero (0, 0)
and unity (1, 0) of the system of complex numbers turn out to be
the real numbers 0 and 1.

We now have to demonstrate that the complex numbers contain
the root of equation (1), that is, a number whose square is equal
to the real number -1. This is the point (0, 1), i.e., a point lying
on the axis of ordinates at a distance 1 upwards from the origin.
Indeed, using (3), we get

$$(0, 1)(0, 1) = (-1, 0) = -1$$

Let us agree to denote this point by the letter i, so that $i^2 = -1$.
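
The construction so far can be mirrored directly with pairs of numbers. The following is a minimal sketch, assuming points with integer or rational coordinates stored as Python tuples; the function names are illustrative, and the comments refer to formulas (2), (3), (4) and (6) of this section.

```python
from fractions import Fraction

def add(p, q):
    return (p[0] + q[0], p[1] + q[1])                  # formula (2)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)              # formula (3)

def sub(p, q):
    return (p[0] - q[0], p[1] - q[1])                  # formula (4)

def div(p, q):
    (a, b), (c, d) = p, q
    n = c * c + d * d
    if n == 0:
        raise ZeroDivisionError("division by the zero point (0, 0)")
    return (Fraction(a * c + b * d, n), Fraction(b * c - a * d, n))   # formula (6)

i = (0, 1)
alpha, beta = (1, 2), (3, -1)
quotient = div(alpha, beta)

print(mul(i, i))                         # the point (-1, 0), i.e. the real number -1
print(mul(alpha, beta))
print(mul(beta, quotient) == alpha)      # the quotient times the divisor gives alpha
```

Multiplying the quotient back by the divisor recovers the dividend, which is exactly the condition used above to define division.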

Finally, let us show how the customary notation of the complex
numbers we have constructed can be obtained. First find the product
of a real number b and the point i:

$$bi = (b, 0)(0, 1) = (0, b)$$

This is a point, consequently, which lies on the ordinate axis and
has ordinate b; all points of the ordinate axis may be represented
by such products. Now if (a, b) is an arbitrary point, then because
of the equation

$$(a, b) = (a, 0) + (0, b)$$

we get

$$(a, b) = a + bi$$


In other words we have arrived at the customary notation of complex 
numbers; the product and sum in the expression a + bi are to 
be understood, of course, in the sense of operations defined in the 
system of complex numbers we have constructed. 

Now that we have constructed the complex numbers, the reader
will have no difficulty in verifying that all the preceding chapters
of this book (the theory of determinants, the theory of systems of
linear equations, the theory of the linear dependence of vectors,
and the theory of matrix operations) carry over without any
restrictions from real numbers to all complex numbers.

Note, in conclusion, that the foregoing construction of the system 
of complex numbers raises the following question. Is it possible 
to define addition and multiplication of points in three-dimensional 
space so that the collection of these points becomes a system of num- 
bers containing within it the system of complex numbers or at 
least the system of real numbers? This question goes beyond the 
scope of the present text, but the answer is no. 

On the other hand, noting that the addition of complex numbers 
as defined above actually coincides with the addition of vectors 
(in a plane) emanating from a coordinate origin (see following 
section), it is natural to pose the question: is it possible, for a cer- 
tain n, to define the multiplication of vectors in an n-dimensional 
real vector space so that, relative to this multiplication and to 
ordinary addition of vectors, our space proves to be a number system 
containing the system of real numbers? It may be demonstrated 
that this cannot be done if we require the fulfillment of all the proper- 
ties of the operations which are valid in the systems of rational, 
real and complex numbers. However, if we reject commutativity 
of multiplication, then such a construction is possible in four-dimen- 
sional space; the resulting system of numbers is called the system 
of quaternions. A similar construction is also possible in eight- 
dimensional space. This yields what is called the system of Cayley 
numbers. In this case, however, we have to give up not only the 
commutativity of multiplication but also associativity, and replace 
the latter by a weaker requirement. 
18. A Deeper Look at Complex Numbers

In keeping with historically evolved traditions, we call the
complex number $i$ the imaginary unit, and numbers of the form
$bi$, pure imaginaries, although we have no doubt about the existence
of such numbers and we can indicate points of the plane (points
on the axis of ordinates) which depict these numbers. In the
notation of the number $\alpha$ as $\alpha = a + bi$, the $a$ is called the real part
of $\alpha$ and $bi$ is called its imaginary part. A plane with points identified
with complex numbers as indicated in Sec. 17 is called the complex
plane. The axis of abscissas (x-axis) is called the axis of reals since
its points depict the real numbers, and the axis of ordinates (y-axis)
of the complex plane is termed the axis of imaginaries.

The addition, multiplication, subtraction and division of complex
numbers written in the form $a + bi$ are performed in the following
manner, as follows from formulas (2), (4), (3) and (6) of the preceding
section:

$(a + bi) + (c + di) = (a + c) + (b + d)i,$

$(a + bi) - (c + di) = (a - c) + (b - d)i,$

$(a + bi)(c + di) = (ac - bd) + (ad + bc)i,$

$\frac{a + bi}{c + di} = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i.$

In the addition of complex numbers, add separately the real parts and
the imaginary parts. Similarly for subtraction. The formulas for
multiplication and division would be too involved if given verbally.
The last formula need not be memorized; simply bear in mind that
it may be derived by multiplying the numerator and denominator
of the given fraction by a number different from the denominator
solely in the sign of the imaginary part. Indeed,

$\frac{a + bi}{c + di} = \frac{(a + bi)(c - di)}{(c + di)(c - di)} = \frac{(ac + bd) + (bc - ad)i}{c^2 + d^2} = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i.$

Examples.

(1) $(2 + 5i) + (1 - 7i) = (2 + 1) + (5 - 7)i = 3 - 2i.$

(2) $(3 - 9i) - (7 + i) = (3 - 7) + (-9 - 1)i = -4 - 10i.$

(3) $(1 + 2i)(3 - i) = [1 \cdot 3 - 2 \cdot (-1)] + [1 \cdot (-1) + 2 \cdot 3]i = 5 + 5i.$

(4) $\frac{23 + i}{3 + i} = \frac{(23 + i)(3 - i)}{3^2 + 1^2} = \frac{70 - 20i}{10} = 7 - 2i.$
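The four rules can be checked mechanically. The following is a minimal sketch, assuming a complex number $a + bi$ is stored as the pair `(a, b)`; the function names `add`, `sub`, `mul` and `div` are illustrative choices and not notation from the text. Exact rational arithmetic is used so that the division formula can be compared without rounding.

```python
from fractions import Fraction

def add(x, y):
    (a, b), (c, d) = x, y
    return (a + c, b + d)

def sub(x, y):
    (a, b), (c, d) = x, y
    return (a - c, b - d)

def mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

def div(x, y):
    (a, b), (c, d) = x, y
    n = c * c + d * d  # c^2 + d^2, nonzero whenever the divisor is nonzero
    return (Fraction(a * c + b * d, n), Fraction(b * c - a * d, n))

# The worked examples (1)-(4), with a + bi written as (a, b).
assert add((2, 5), (1, -7)) == (3, -2)
assert sub((3, -9), (7, 1)) == (-4, -10)
assert mul((1, 2), (3, -1)) == (5, 5)
assert div((23, 1), (3, 1)) == (7, -2)
```

Multiplying the quotient back by the divisor recovers the dividend, which is a quick way to confirm the division rule on other inputs.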

The portrayal of complex numbers as points in a plane results
in a natural desire to have a geometric interpretation of the
operations involving complex numbers. For addition, this interpretation
is simple. Suppose we have the numbers $\alpha = a + bi$ and $\beta = c + di$.
Join the corresponding points $(a, b)$ and $(c, d)$ with line segments
to the origin and construct a parallelogram on these segments,
as sides, as shown in Fig. 2. The fourth vertex of the parallelogram
will obviously be the point $(a + c, b + d)$. Thus, the addition
of complex numbers geometrically is accomplished in accord with
the parallelogram rule, which is to say by the rule of addition of vectors
emanating from the coordinate origin. Also, the number opposite
to $\alpha = a + bi$ is a point in the complex
plane that is symmetric to $\alpha$ about the origin
(Fig. 3). This gives the geometric
interpretation of subtraction.

The geometric meaning of multiplication and division of complex
numbers will become clear only after we introduce a new
notation for them that differs from that used heretofore. The
notation of $\alpha$ as $\alpha = a + bi$ makes use of the Cartesian
coordinates of a point corresponding to that number. However,
the position of a point in the plane is also completely defined by
specifying its polar coordinates: the distance $r$ from the origin to
the point and the angle $\varphi$ between the positive x-axis (axis of
abscissas) and the direction from the origin to the point (Fig. 4).

The number $r$ is a nonnegative real number which is zero only
at the point 0. For $\alpha$ on the real axis (that is to say, for $\alpha$ a real
number), the number $r$ is the absolute value of $\alpha$; for this reason,
for any complex number $\alpha$, the number $r$ is sometimes called the
absolute value of $\alpha$; more often, however, the number $r$ is called
the modulus of the number $\alpha$ and is denoted by $|\alpha|$.

[Fig. 2]

The angle $\varphi$ is called the argument of the number $\alpha$ and is denoted
by $\arg \alpha$ [we thus dispense with the customary names of the polar
coordinates of a point: the radius vector and the polar (or vectorial)
angle]. The angle $\varphi$ can take on any real values (positive or
negative), the positive angles being reckoned counterclockwise. But
if the angles differ by $2\pi$ or a multiple of $2\pi$, then the points they
depict in the plane will be coincident.


Thus, the argument of a complex number $\alpha$ has an infinity
of values differing by integral multiples of the number $2\pi$; from
the equality of two complex numbers specified by their moduli
and arguments one can only conclude, consequently, that the
arguments differ by an integral multiple of $2\pi$, whereas the moduli are
the same. It is only for the number 0 that the argument is not defined.
However, this number is fully determined by the equation $|0| = 0$.

The argument of a complex number is a natural generalization 
of the sign of a real number. The argument of a positive real number 
is zero, the argument of a negative real number is $\pi$. There are
only two directions out of the origin on the axis of reals and they 
may be distinguished by two symbols: + and — . Now in the complex 
plane, there are infinitely many directions issuing from the point 0, 
and they differ in the angle formed with the positive direction of 
the real axis. 

The Cartesian and polar coordinates of a point are connected
by the following relation which holds true for any position of points
in the plane:

$a = r \cos\varphi, \quad b = r \sin\varphi$    (1)

Whence

$r = +\sqrt{a^2 + b^2}$    (2)

Let us apply formulas (1) to an arbitrary complex number
$\alpha = a + bi$:

$\alpha = a + bi = r \cos\varphi + (r \sin\varphi)\,i$

or

$\alpha = r(\cos\varphi + i \sin\varphi)$    (3)

Conversely, let the number $\alpha = a + bi$ admit a notation of the
form $\alpha = r_0(\cos\varphi_0 + i \sin\varphi_0)$, where $r_0$ and $\varphi_0$ are certain real
numbers and $r_0 > 0$. Then $r_0 \cos\varphi_0 = a$, $r_0 \sin\varphi_0 = b$, whence
$r_0 = +\sqrt{a^2 + b^2}$, that is, by (2), $r_0 = |\alpha|$. Whence, using (1), we
get $\cos\varphi_0 = \cos\varphi$, $\sin\varphi_0 = \sin\varphi$, that is, $\varphi_0 = \arg\alpha$. Thus, any complex
number $\alpha$ is uniquely defined by (3), where $r = |\alpha|$, $\varphi = \arg\alpha$
(the argument $\varphi$ is of course defined only to within multiples
of $2\pi$). This notation of the number $\alpha$ is called the trigonometric
form and will be used very often in the sequel.
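Formulas (1) and (2) translate directly into a conversion between the two notations. The sketch below is illustrative only: the names `to_trig` and `from_trig` are assumptions, and the argument is returned in the particular range $(-\pi, \pi]$ chosen by the standard `math.atan2` function, which is one of the infinitely many admissible values.

```python
import math

def to_trig(a, b):
    """Return (r, phi) for a + bi, with phi in (-pi, pi]."""
    if a == 0 and b == 0:
        raise ValueError("the argument of 0 is not defined")
    r = math.hypot(a, b)    # formula (2): r = +sqrt(a^2 + b^2)
    phi = math.atan2(b, a)  # an angle with cos(phi) = a/r and sin(phi) = b/r
    return r, phi

def from_trig(r, phi):
    """Return (a, b) using formula (1)."""
    return r * math.cos(phi), r * math.sin(phi)

r, phi = to_trig(-1.0, 1.0)
a, b = from_trig(r, phi)
print(r, phi, a, b)
```

Since floating-point values of the sine and cosine are rounded, the round trip reproduces `a` and `b` only approximately, which mirrors the remark in the text that exact angles and exact trigonometric values are rarely available.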

The numbers 

., / .1 ... .1 \ a i9 ... 19 

a = o (cos isin j , p = cos i -f i sin y .i 

and 

T = I S [cos ( -- f ) 4 i sin ( -^) ] 
are given in trigonometric form; here $|\alpha| = 3$, $|\beta| = 1$, $|\gamma| =$

o 19 -T / o ji 13 \ 

arg a = -^, argp = argv= — — ^orarg^^ — , argY--n) . 

On the other hand, the complex numbers 
a' — { — 2} (cos-^ + isiny) , p' = 3 (cos jt — ^ sin y ji| . 


V' = 2(cos + i sin y Jtj , 6' = siny .t + i cos-|- ji 


are not given in trigonometric form, although their notations resemble
that of (3). In trigonometric form, these numbers look like

a' = 2 (cos 1“ -f * sin = ^ (cns y jx-f i sin y n| , 



7 , . . 7 

-T-.n-l-zsin-T- 

•j 4 


.1 


Finding the trigonometric form of the number $\gamma'$ involves difficulties
that are almost always encountered when passing from the customary
notation of a complex number to its trigonometric notation and
vice versa: with the exception of a few cases, it is impossible to
find the exact angle on the basis of given numerical values of the
sine and cosine, and it is impossible for a given angle to write the
exact values of its sine and cosine.

Let the complex numbers $\alpha$ and $\beta$ be given in trigonometric
form: $\alpha = r(\cos\varphi + i \sin\varphi)$, $\beta = r'(\cos\varphi' + i \sin\varphi')$.
Multiplying these numbers together, we get

$\alpha\beta = [r(\cos\varphi + i \sin\varphi)][r'(\cos\varphi' + i \sin\varphi')]$
$= rr'(\cos\varphi \cos\varphi' + i \cos\varphi \sin\varphi' + i \sin\varphi \cos\varphi' - \sin\varphi \sin\varphi')$

or

$\alpha\beta = rr'[\cos(\varphi + \varphi') + i \sin(\varphi + \varphi')]$    (4)

We have the product $\alpha\beta$ written in trigonometric form, so
$|\alpha\beta| = rr'$ or

$|\alpha\beta| = |\alpha| \cdot |\beta|$    (5)

In words, the modulus of a product of complex numbers is equal to the
product of the moduli of the factors. Also, $\arg(\alpha\beta) = \varphi + \varphi'$ or

$\arg(\alpha\beta) = \arg\alpha + \arg\beta$    (6)

The argument of a product of complex numbers is equal to the sum
of the arguments of the factors (note that equality here means to within
a multiple of $2\pi$). These rules obviously carry over to any finite
number of factors. As applied to real numbers, formula (5) yields
the familiar property of absolute values of the numbers, and
(6), as can readily be verified, turns into the rule of signs in the
multiplication of real numbers.
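A numerical illustration of the rules for the modulus and argument of a product may help here. This is a sketch using Python's standard `cmath` module; the sample moduli and angles, and the helper `same_angle` (which compares two angles up to a multiple of $2\pi$), are assumptions made for the example.

```python
import cmath

def same_angle(x, y, tol=1e-9):
    """True if the angles x and y differ by an integral multiple of 2*pi."""
    return abs(cmath.exp(1j * (x - y)) - 1) < tol

alpha = cmath.rect(2.0, 2.5)  # modulus 2, argument 2.5
beta = cmath.rect(3.0, 1.5)   # modulus 3, argument 1.5
product = alpha * beta

# The modulus of the product is the product of the moduli.
assert abs(abs(product) - 2.0 * 3.0) < 1e-9

# The arguments add, but only to within a multiple of 2*pi:
# cmath.phase returns a value in (-pi, pi], while 2.5 + 1.5 = 4.0 lies outside it.
assert same_angle(cmath.phase(product), 2.5 + 1.5)
print(cmath.phase(product))
```

The sample angles are chosen so that their sum exceeds $\pi$, which makes the "to within a multiple of $2\pi$" caveat visible in the printed value.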


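Analogous rules are valid in the case of a quotient. Indeed, let
$\alpha = r(\cos\varphi + i \sin\varphi)$, $\beta = r'(\cos\varphi' + i \sin\varphi')$, $\beta \neq 0$; that is,
$r' \neq 0$. Then

$\frac{\alpha}{\beta} = \frac{r(\cos\varphi + i \sin\varphi)}{r'(\cos\varphi' + i \sin\varphi')} = \frac{r(\cos\varphi + i \sin\varphi)(\cos\varphi' - i \sin\varphi')}{r'(\cos^2\varphi' + \sin^2\varphi')}$
$= \frac{r}{r'}(\cos\varphi \cos\varphi' + i \sin\varphi \cos\varphi' - i \cos\varphi \sin\varphi' + \sin\varphi \sin\varphi')$

or

$\frac{\alpha}{\beta} = \frac{r}{r'}[\cos(\varphi - \varphi') + i \sin(\varphi - \varphi')]$    (7)

Whence it follows that $\left|\frac{\alpha}{\beta}\right| = \frac{r}{r'}$ or

$\left|\frac{\alpha}{\beta}\right| = \frac{|\alpha|}{|\beta|}$    (8)

The modulus of a quotient of two complex numbers is equal to the
modulus of the dividend divided by the modulus of the divisor. Also,
$\arg\left(\frac{\alpha}{\beta}\right) = \varphi - \varphi'$ or

$\arg\left(\frac{\alpha}{\beta}\right) = \arg\alpha - \arg\beta$    (9)

The argument of a quotient of two complex numbers is obtained by
subtracting the argument of the divisor from the argument of the dividend.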



It is not difficult now to grasp the geometric meaning of
multiplication and division. Because of (5) and (6), we get a point depicting
the product of the number $\alpha$ by the number $\beta = r'(\cos\varphi' + i \sin\varphi')$
if the vector from 0 to $\alpha$ (Fig. 5) is rotated counterclockwise through
an angle $\varphi' = \arg\beta$ and then stretched by a factor $r' = |\beta|$ (for
$0 \le r' < 1$ it will be a compression instead of a dilation). Also,
from (7) it follows that for $\alpha = r(\cos\varphi + i \sin\varphi) \neq 0$ we have

$\alpha^{-1} = r^{-1}[\cos(-\varphi) + i \sin(-\varphi)]$    (10)

i.e., $|\alpha^{-1}| = |\alpha|^{-1}$, $\arg(\alpha^{-1}) = -\arg\alpha$. We thus obtain point $\alpha^{-1}$
if from point $\alpha$ we go to point $\alpha'$ at a distance $r^{-1}$ from zero on the
same half-line emanating from zero as is point $\alpha$ (Fig. 6),* and then
go to a point symmetric to $\alpha'$ about the real axis.
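The quotient rules and the two-step geometric construction of the reciprocal can be traced numerically. The following is a sketch under stated assumptions: the sample values are arbitrary, and `inverse_by_construction` is a hypothetical helper name that follows the construction just described (move along the same half-line to distance $r^{-1}$, then reflect about the real axis).

```python
import cmath

def inverse_by_construction(alpha):
    """Reciprocal of a nonzero alpha via the geometric construction."""
    r, phi = cmath.polar(alpha)
    alpha_prime = cmath.rect(1.0 / r, phi)  # same half-line, distance 1/r from zero
    return alpha_prime.conjugate()          # symmetric point about the real axis

alpha = cmath.rect(2.0, 0.7)
beta = cmath.rect(4.0, 2.0)

# Quotient: moduli divide, arguments subtract.
modulus, argument = cmath.polar(alpha / beta)
print(modulus, argument)

# The construction agrees with ordinary division.
print(inverse_by_construction(alpha), 1 / alpha)
```

Here the difference of the arguments, $0.7 - 2.0$, already lies in $(-\pi, \pi]$, so `cmath.polar` returns it without any shift by $2\pi$.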

A sum and difference of complex numbers given in trigonometric
form cannot be expressed by formulas similar to (4) and (7). However,
for the modulus of a sum we have the following important
inequalities:

$|\alpha| - |\beta| \le |\alpha + \beta| \le |\alpha| + |\beta|$    (11)

In words, the modulus of a sum of two complex numbers is less than
or equal to the sum of the moduli of the terms but greater than or equal
to the difference of these moduli. Inequalities (11) follow from the
familiar theorem of elementary geometry concerning the sides of
a triangle because $|\alpha + \beta|$ is, as we know, equal to the diagonal
of a parallelogram with sides $|\alpha|$ and $|\beta|$. Incidentally, the case
for points $\alpha$, $\beta$ and 0 lying on one straight line requires a special
investigation, which we leave to the reader. It is only in this case
that the equalities are attained in formulas (11).

From (11), because $\alpha - \beta = \alpha + (-\beta)$ and

$|-\beta| = |\beta|$    (12)

(this equation follows at the very least from the geometric
interpretation of the number $-\beta$), also follow the inequalities

$|\alpha| - |\beta| \le |\alpha - \beta| \le |\alpha| + |\beta|$    (13)

That is, the same inequalities hold for the modulus of a difference
as for the modulus of a sum.

Inequalities (11) might be obtained in the following manner.
Let $\alpha = r(\cos\varphi + i \sin\varphi)$, $\beta = r'(\cos\varphi' + i \sin\varphi')$ and let
the trigonometric form of the number $\alpha + \beta$ be
$\alpha + \beta = R(\cos\psi + i \sin\psi)$. Adding the real and imaginary parts
separately, we obtain

$r \cos\varphi + r' \cos\varphi' = R \cos\psi,$
$r \sin\varphi + r' \sin\varphi' = R \sin\psi$

* $|\alpha'| = |\alpha|$ if and only if $|\alpha| = 1$, that is, if the point $\alpha$ lies on the
circumference of the unit circle. If $\alpha$ is inside the unit circle, then $\alpha'$ will be
outside it, and vice versa. In this way we obviously obtain a one-to-one
correspondence between all points of the complex plane outside the unit circle
and all nonzero points within the unit circle.
Multiplying both sides of the first equation by $\cos\psi$ and both sides
of the second by $\sin\psi$ and then adding, we get

$r(\cos\varphi \cos\psi + \sin\varphi \sin\psi) + r'(\cos\varphi' \cos\psi + \sin\varphi' \sin\psi) = R(\cos^2\psi + \sin^2\psi)$

That is,

$r \cos(\varphi - \psi) + r' \cos(\varphi' - \psi) = R$

Whence, since the cosine is never greater than unity, follows the
inequality $r + r' \ge R$, or $|\alpha| + |\beta| \ge |\alpha + \beta|$. On the other
hand, $\alpha = (\alpha + \beta) - \beta = (\alpha + \beta) + (-\beta)$, whence, by what has
been proved and by virtue of (12),

$|\alpha| \le |\alpha + \beta| + |-\beta| = |\alpha + \beta| + |\beta|$

From this, $|\alpha| - |\beta| \le |\alpha + \beta|$.
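The two-sided inequality for the modulus of a sum is easy to probe numerically. The sketch below is not a proof; it merely tests the inequality on pseudo-random samples and on two collinear cases where equality is attained. The function name `satisfies_inequalities`, the seed and the tolerance are assumptions of the example.

```python
import random

def satisfies_inequalities(alpha, beta, tol=1e-12):
    """Check |alpha| - |beta| <= |alpha + beta| <= |alpha| + |beta|."""
    lower = abs(alpha) - abs(beta)
    upper = abs(alpha) + abs(beta)
    return lower - tol <= abs(alpha + beta) <= upper + tol

random.seed(0)  # fixed seed so the run is reproducible
samples = [
    (complex(random.uniform(-5, 5), random.uniform(-5, 5)),
     complex(random.uniform(-5, 5), random.uniform(-5, 5)))
    for _ in range(1000)
]
assert all(satisfies_inequalities(a, b) for a, b in samples)

# Equality occurs only when alpha, beta and 0 lie on one straight line.
assert abs(2 + 0j) + abs(3 + 0j) == abs((2 + 0j) + (3 + 0j))    # upper bound attained
assert abs(3 + 0j) - abs(-1 + 0j) == abs((3 + 0j) + (-1 + 0j))  # lower bound attained
```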

It is well to note that for complex numbers the concepts of
"more than" and "less than" cannot be reasonably defined because
these numbers, in contrast to the real numbers, are not located
on a straight line, whose points are naturally ordered, but in a plane.
For this reason, complex numbers as such (not their moduli) can
never be connected by an inequality sign.

Conjugate numbers. Suppose we have
a complex number $\alpha = a + bi$. The number
$a - bi$, which differs from $\alpha$ solely in the
sign in front of the imaginary part, is called
the conjugate of $\alpha$ and is denoted by $\bar{\alpha}$.

It will be recalled that when considering
the division of complex numbers we resorted
to conjugate numbers but did not introduce
that term.

The conjugate number of $\bar{\alpha}$ is obviously
$\alpha$; in other words, we can speak of a pair of
conjugate numbers. The real numbers are the only numbers which
are conjugate to themselves.

Geometrically, conjugate numbers are points symmetric about
the real axis (Fig. 7). Whence follow the equations

$|\bar{\alpha}| = |\alpha|, \quad \arg\bar{\alpha} = -\arg\alpha$    (14)


[Fig. 7]


The sum and product of conjugate complex numbers are real numbers.
Indeed,

$\alpha + \bar{\alpha} = 2a, \quad \alpha\bar{\alpha} = a^2 + b^2 = |\alpha|^2$    (15)

The last equation shows that the number $\alpha\bar{\alpha}$ is in fact positive
for $\alpha \neq 0$. In Sec. 24 we will derive a theorem which shows that
the property proved here is characteristic of conjugate numbers.
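The equation

$(a - bi) + (c - di) = (a + c) - (b + d)i$

shows that the conjugate of a sum of two numbers is equal to the sum
of the conjugates of the numbers:

$\overline{\alpha + \beta} = \bar{\alpha} + \bar{\beta}$    (16)

Similarly, from the equation

$(a - bi)(c - di) = (ac - bd) - (ad + bc)i$

it follows that the conjugate of a product is equal to the product of the
conjugates of the factors:

$\overline{\alpha\beta} = \bar{\alpha} \cdot \bar{\beta}$    (17)

Direct verification also shows the following formulas to be valid:

$\overline{\alpha - \beta} = \bar{\alpha} - \bar{\beta}$    (18)

$\overline{\left(\frac{\alpha}{\beta}\right)} = \frac{\bar{\alpha}}{\bar{\beta}}$    (19)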

We will now prove the following assertion: if a number $\alpha$ is in
some way expressed in terms of the complex numbers $\beta_1, \beta_2, \ldots, \beta_n$
by means of addition, multiplication, subtraction and division, then
by replacing all the numbers in this expression by their conjugates,
we obtain the conjugate of $\alpha$; in particular, if $\alpha$ is a real number, it
does not change when all the complex numbers are replaced by
their conjugates.

We shall prove this assertion by means of induction with respect
to $n$, since for $n = 2$ it follows from formulas (16)-(19).

Let the number $\alpha$ be expressed by the numbers $\beta_1, \beta_2, \ldots, \beta_n$,
not necessarily distinct. This expression gives a definite order in
which the operations of addition, multiplication, subtraction and
division are applied. The last step will be to apply one of these
operations to the number $\gamma_1$ expressed in terms of the numbers
$\beta_1, \beta_2, \ldots, \beta_k$, where $1 \le k \le n - 1$, and the number $\gamma_2$ expressed
in terms of the numbers $\beta_{k+1}, \ldots, \beta_n$. By the induction hypothesis,
replacement of the numbers $\beta_1, \beta_2, \ldots, \beta_k$ by their conjugates
implies a replacement of the number $\gamma_1$ by the number $\bar{\gamma}_1$, and
a replacement of the numbers $\beta_{k+1}, \beta_{k+2}, \ldots, \beta_n$ by their
conjugates implies substitution of $\gamma_2$ by $\bar{\gamma}_2$. However, by one of the
formulas (16)-(19), the transition from $\gamma_1$ and $\gamma_2$ to $\bar{\gamma}_1$ and $\bar{\gamma}_2$ converts
the number $\alpha$ to $\bar{\alpha}$.
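The assertion can be illustrated on a concrete expression. This is a sketch only: the expression `expr` and the sample numbers are arbitrary assumptions, chosen so that all four operations occur and no division by zero arises. Comparisons use a small tolerance because floating-point arithmetic is inexact.

```python
def expr(b1, b2, b3):
    """A sample expression using +, -, * and / on three complex numbers."""
    return (b1 * b2 - b3) / (b1 + b2)

betas = (1 + 2j, 3 - 1j, 2 + 5j)

# Replace every beta by its conjugate, then evaluate ...
conjugated_inputs = expr(*(b.conjugate() for b in betas))
# ... versus evaluate first, then conjugate the result.
conjugated_output = expr(*betas).conjugate()

assert abs(conjugated_inputs - conjugated_output) < 1e-12

# A real-valued combination is unchanged by the replacement.
def real_valued(b):
    return b * b.conjugate() + (b + b.conjugate())

z = 3 - 4j
assert real_valued(z) == real_valued(z.conjugate())
```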
19. Taking Roots of Complex Numbers


Let us now examine the raising of complex numbers to a power
and the taking of roots. To raise a number $\alpha = a + bi$ to a positive
integral power $n$, it suffices to apply Newton's binomial theorem
to the expression $(a + bi)^n$ (this formula holds true for complex
numbers as well, since its proof is based solely on the distributive
law) and then take advantage of the equations $i^2 = -1$, $i^3 = -i$,
$i^4 = 1$, whence, generally,

$i^{4k} = 1, \quad i^{4k+1} = i, \quad i^{4k+2} = -1, \quad i^{4k+3} = -i$

If a number $\alpha$ is given in trigonometric form, then for a positive
integral $n$, there follows from (4) of Sec. 18 the following formula
called De Moivre's formula:

$[r(\cos\varphi + i \sin\varphi)]^n = r^n(\cos n\varphi + i \sin n\varphi)$    (1)

In raising a complex number to a power, raise the modulus to that power
and multiply the argument by the exponent. Formula (1) holds true
for negative integral exponents as well. Indeed, since $\alpha^{-n} = (\alpha^{-1})^n$,
it is sufficient to apply the De Moivre formula to the number $\alpha^{-1}$,
the trigonometric form of which is given by (10), Sec. 18.

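Examples.

(1) $i^{122} = -1.$

(2) $(2 + 5i)^3 = 2^3 + 3 \cdot 2^2 \cdot 5i + 3 \cdot 2 \cdot (5i)^2 + (5i)^3$
$= 8 + 60i - 150 - 125i = -142 - 65i.$

(3) $\left[\sqrt{2}\left(\cos\frac{\pi}{4} + i \sin\frac{\pi}{4}\right)\right]^4 = (\sqrt{2})^4(\cos\pi + i \sin\pi) = -4.$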

(4) [3(cos| + <'^in^)j"' 

= 3-3 [^cos (-1 .T)+( sin --t) (cos-Iji + isinin) . 
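De Moivre's formula can be compared against repeated multiplication. The following sketch assumes a number is supplied by its modulus and argument; the helper name `de_moivre` and the sample values are illustrative assumptions, and results are compared up to floating-point tolerance.

```python
import cmath
import math

def de_moivre(r, phi, n):
    """n-th power (n any integer) of r(cos phi + i sin phi), r > 0."""
    return cmath.rect(r ** n, n * phi)  # modulus r^n, argument n*phi

# Positive exponent: the fourth power of sqrt(2)(cos pi/4 + i sin pi/4).
power = de_moivre(math.sqrt(2), math.pi / 4, 4)
print(power)

# Negative exponent: compare with ordinary complex arithmetic.
alpha = cmath.rect(3.0, 0.4)
assert abs(de_moivre(3.0, 0.4, -3) - alpha ** -3) < 1e-12
assert abs(de_moivre(3.0, 0.4, 5) - alpha ** 5) < 1e-9
```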

A special case of De Moivre's formula, namely, the equation

$(\cos\varphi + i \sin\varphi)^n = \cos n\varphi + i \sin n\varphi$

permits finding with ease formulas for the sine and cosine of a
multiple angle. Indeed, expanding the left member of this equation
by the binomial formula and equating the real and imaginary parts
of both sides separately, we obtain

$\cos n\varphi = \cos^n\varphi - \binom{n}{2}\cos^{n-2}\varphi \sin^2\varphi + \binom{n}{4}\cos^{n-4}\varphi \sin^4\varphi - \ldots,$

$\sin n\varphi = \binom{n}{1}\cos^{n-1}\varphi \sin\varphi - \binom{n}{3}\cos^{n-3}\varphi \sin^3\varphi + \binom{n}{5}\cos^{n-5}\varphi \sin^5\varphi - \ldots$

Here, $\binom{n}{k}$ is the usual notation for a binomial coefficient:

$\binom{n}{k} = \frac{n(n-1)(n-2)\ldots(n-k+1)}{1 \cdot 2 \cdot 3 \ldots k}$

For $n = 2$ we arrive at the familiar formulas

$\cos 2\varphi = \cos^2\varphi - \sin^2\varphi,$
$\sin 2\varphi = 2 \cos\varphi \sin\varphi$

and for $n = 3$ we obtain the formulas

$\cos 3\varphi = \cos^3\varphi - 3 \cos\varphi \sin^2\varphi,$
$\sin 3\varphi = 3 \cos^2\varphi \sin\varphi - \sin^3\varphi$

Extracting roots of complex numbers is a far more difficult task.
Let us start with the square root of the number $\alpha = a + bi$. As yet
we do not know whether there exists a complex number whose
square is equal to $\alpha$. Let us assume that such a number $u + vi$
exists; that is, using conventional symbols, we can write

$\sqrt{a + bi} = u + vi$

From the equation

$(u + vi)^2 = a + bi$

it follows that

$u^2 - v^2 = a, \quad 2uv = b$    (2)

Squaring both sides of each of the equations of (2) and then adding,
we get

$(u^2 - v^2)^2 + 4u^2v^2 = (u^2 + v^2)^2 = a^2 + b^2$

whence

$u^2 + v^2 = +\sqrt{a^2 + b^2}$

The plus sign is taken because the numbers $u$ and $v$ are real and
therefore the left member of the equation is positive. From this
equation and from the first of the equations of (2), we get

$u^2 = \frac{1}{2}\left(a + \sqrt{a^2 + b^2}\right), \quad v^2 = \frac{1}{2}\left(-a + \sqrt{a^2 + b^2}\right)$

Thus, extracting the square roots we get two values for $u$ which
differ in sign and also two values for $v$. All these values will be real
since the square roots are extracted from positive numbers for any
$a$ and $b$. The values obtained for $u$ and $v$ cannot be combined in
arbitrary fashion, since, by the second equation of (2), the sign of the
product $uv$ must coincide with the sign of $b$. This yields two possible
combinations of values of $u$ and $v$, that is, two numbers of the form
$u + vi$ which can serve as values of the square root of the number $\alpha$;
these numbers differ in sign. An elementary though unwieldy check
(squaring the resulting numbers separately for the case $b > 0$ and
$b < 0$) shows that the numbers we found are indeed the values of the
square root of the number $\alpha$. Thus, taking the square root of a
complex number is always possible and yields two values which differ in sign.

In particular, it now becomes possible to extract the square root
of a negative real number; the values of this root will be pure
imaginaries. Indeed, if $a < 0$ and $b = 0$, then $\sqrt{a^2 + b^2} = -a$, since
this root must be positive, but then $u^2 = \frac{1}{2}(a - a) = 0$, that
is, $u = 0$, whence $\sqrt{\alpha} = \pm vi$.

Example. Let $\alpha = 21 - 20i$. Then $\sqrt{a^2 + b^2} = \sqrt{441 + 400} = 29$.
Therefore, $u^2 = \frac{1}{2}(21 + 29) = 25$, $v^2 = \frac{1}{2}(-21 + 29) = 4$, whence $u = \pm 5$,
$v = \pm 2$. The signs of $u$ and $v$ must be different since $b$ is negative, therefore

$\sqrt{21 - 20i} = \pm(5 - 2i)$
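The algebraic procedure for the square root can be written out step by step. The sketch below follows the formulas for $u^2$ and $v^2$ and the sign rule for $uv$; the function name `complex_sqrt` is an assumption, and the `max(0.0, ...)` guard is an added precaution against tiny negative values caused by floating-point rounding, not part of the derivation.

```python
import math

def complex_sqrt(a, b):
    """Both square roots of a + bi, returned as pairs (u, v) meaning u + vi."""
    m = math.hypot(a, b)                      # +sqrt(a^2 + b^2)
    u = math.sqrt(max(0.0, (a + m) / 2))      # u^2 = (a + m) / 2
    v = math.sqrt(max(0.0, (-a + m) / 2))     # v^2 = (-a + m) / 2
    if b < 0:
        v = -v  # 2uv = b, so u and v have opposite signs when b is negative
    return (u, v), (-u, -v)

first, second = complex_sqrt(21, -20)
print(first, second)

# Squaring a returned root recovers the original number.
u, v = first
print(u * u - v * v, 2 * u * v)
```

For a negative real number such as $-9$ the function returns a pair with $u = 0$, in agreement with the remark that such roots are pure imaginaries.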


Attempts to extract higher (than second) roots of complex numbers
given in the form $a + bi$ encounter insuperable difficulties.
Thus, if we wished to extract the cube root of a number $a + bi$,
we would first have to solve some auxiliary cubic equation, which
we are as yet unable to do, and which in turn would require, as we
shall see in Sec. 38, the extraction of the cube root of a complex
number. On the other hand, the trigonometric form is extremely well
suited to extracting roots of any degree. Using the trigonometric
form we will now exhaust this problem completely.

Let it be required to extract the $n$th root of a number
$\alpha = r(\cos\varphi + i \sin\varphi)$. Let us assume that this is possible and that
we get the number $\rho(\cos\theta + i \sin\theta)$, that is

$[\rho(\cos\theta + i \sin\theta)]^n = r(\cos\varphi + i \sin\varphi)$    (3)

Then, by De Moivre's formula, $\rho^n = r$, that is $\rho = \sqrt[n]{r}$, where
the right member contains a uniquely determined positive value
of the $n$th root of the positive real number $r$. On the other hand, the
argument of the left member of (3) is $n\theta$. We cannot assert, however,
that $n\theta$ is equal to $\varphi$, since these angles may actually differ by some
integral multiple of $2\pi$. Therefore, $n\theta = \varphi + 2k\pi$, where $k$ is an
integer, whence

$\theta = \frac{\varphi + 2k\pi}{n}$

Conversely, if we take the number

$\sqrt[n]{r}\left(\cos\frac{\varphi + 2k\pi}{n} + i \sin\frac{\varphi + 2k\pi}{n}\right),$

then for any integral $k$, positive or negative, the $n$th power of
this number is equal to $\alpha$. Thus

$\sqrt[n]{r(\cos\varphi + i \sin\varphi)} = \sqrt[n]{r}\left(\cos\frac{\varphi + 2k\pi}{n} + i \sin\frac{\varphi + 2k\pi}{n}\right)$    (4)

Assigning different values to $k$, we will not always get distinct
values of the required root. Indeed, for

$k = 0, 1, 2, \ldots, n - 1$    (5)

we get $n$ values of the root, all distinct, since increasing $k$ by unity
implies increasing the argument by $\frac{2\pi}{n}$. Now let $k$ be arbitrary.
If $k = nq + r$, $0 \le r \le n - 1$, then

$\frac{\varphi + 2k\pi}{n} = \frac{\varphi + 2(nq + r)\pi}{n} = \frac{\varphi + 2r\pi}{n} + 2q\pi$

In other words, the value of the argument for our $k$ differs from the
value of the argument for $k = r$ by a multiple of $2\pi$. We thus obtain
the same value of the root as for the value of $k$ equal to $r$, that is,
such as lies in the set (5).

Thus, extracting the $n$th root of a complex number $\alpha$ is always
possible and yields $n$ distinct values. All values of the $n$th root lie on
a circle of radius $\sqrt[n]{|\alpha|}$ with centre at zero and divide the circle into
$n$ equal parts.

In particular, the $n$th root of a real number $a$ also has $n$ distinct
values, of which two, one, or none will be real, depending on the
sign of $a$ and the parity of $n$.


Examples. 


0) 


P= )/^2 (cos Ji + isin n) =5'2( 


J.1 + 2ICR 

cos 5 f-J 




/,- = 0: Po=> - (cos-^-f-isin-^j : 

A-=i: pi = v 2 (co.s~.T + isinJ^.T| 

k = 2: p^=^"2 (cos + • 


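(2) $\beta = \sqrt{i} = \sqrt{\cos\frac{\pi}{2} + i \sin\frac{\pi}{2}} = \cos\frac{\frac{\pi}{2} + 2k\pi}{2} + i \sin\frac{\frac{\pi}{2} + 2k\pi}{2}$;

$\beta_0 = \cos\frac{\pi}{4} + i \sin\frac{\pi}{4}, \quad \beta_1 = \cos\frac{5}{4}\pi + i \sin\frac{5}{4}\pi = -\beta_0.$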


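(3) $\beta = \sqrt[3]{-8} = \sqrt[3]{8(\cos\pi + i \sin\pi)} = 2\left(\cos\frac{\pi + 2k\pi}{3} + i \sin\frac{\pi + 2k\pi}{3}\right)$;

$\beta_0 = 2\left(\cos\frac{\pi}{3} + i \sin\frac{\pi}{3}\right) = 1 + i\sqrt{3};$

$\beta_1 = 2(\cos\pi + i \sin\pi) = -2;$

$\beta_2 = 2\left(\cos\frac{5\pi}{3} + i \sin\frac{5\pi}{3}\right) = 1 - i\sqrt{3}.$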
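The root-extraction formula lends itself to a direct computation. The following is a sketch assuming the standard `cmath` module; the function name `nth_roots` is an illustrative choice. It takes $k = 0, 1, \ldots, n - 1$, the positive real $n$th root of the modulus, and the arguments $(\varphi + 2k\pi)/n$.

```python
import cmath
import math

def nth_roots(alpha, n):
    """All n values of the n-th root of a nonzero complex number alpha."""
    r, phi = cmath.polar(alpha)
    rho = r ** (1.0 / n)  # positive real n-th root of the modulus
    return [cmath.rect(rho, (phi + 2 * math.pi * k) / n) for k in range(n)]

roots = nth_roots(-8, 3)
for k, root in enumerate(roots):
    print(k, root)

# Every value, raised to the third power, gives back -8 (up to rounding).
assert all(abs(root ** 3 - (-8)) < 1e-9 for root in roots)
```

Up to rounding, the three printed values are $1 + i\sqrt{3}$, $-2$ and $1 - i\sqrt{3}$, the cube roots of $-8$.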
Roots of unity. Of particular importance is the case of extracting
the $n$th root of unity. This root has $n$ values, and, because of the
equation $1 = \cos 0 + i \sin 0$, and formula (4), all these values or,
as we shall say, all the $n$th roots of unity, are given by the formula

$\sqrt[n]{1} = \cos\frac{2k\pi}{n} + i \sin\frac{2k\pi}{n}, \quad k = 0, 1, \ldots, n - 1$    (6)

The real values of the $n$th root of unity are obtained from formula (6)
for the values $k = 0$ and $\frac{n}{2}$, if $n$ is even, and for $k = 0$ if $n$ is odd.

In the complex plane, the $n$th roots of unity are located on the
circumference of the unit circle and divide it into $n$ equal arcs; one
of the division points is the number 1. From this it follows that
those of the $n$th roots of unity which are not real are situated
symmetrically about the real axis (that is, are pairwise conjugate).
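The statements about real roots and conjugate pairs can be observed numerically. This is a minimal sketch; the names `roots_of_unity` and `real_indices` and the tolerance used to decide that an imaginary part is zero are assumptions of the example.

```python
import cmath
import math

def roots_of_unity(n):
    """The n-th roots of unity for k = 0, 1, ..., n - 1."""
    return [cmath.rect(1.0, 2 * math.pi * k / n) for k in range(n)]

def real_indices(n, tol=1e-12):
    """Indices k whose root is real (imaginary part zero up to rounding)."""
    return [k for k, z in enumerate(roots_of_unity(n)) if abs(z.imag) < tol]

print(real_indices(6))  # n even
print(real_indices(5))  # n odd

# Non-real roots come in conjugate pairs: the root for k pairs with the root for n - k.
n = 6
roots = roots_of_unity(n)
for k in range(1, n):
    assert abs(roots[k] - roots[n - k].conjugate()) < 1e-12
```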

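The square root of unity has two values: 1 and $-1$; the
fourth root of unity has four values: 1, $-1$, $i$ and $-i$. It is
advisable for what follows to memorize the values of the cube
root of unity. By (6), the roots are $\cos\frac{2k\pi}{3} + i \sin\frac{2k\pi}{3}$, where
$k = 0, 1, 2$; that is, besides unity, the conjugate numbers

$\varepsilon_1 = \cos\frac{2\pi}{3} + i \sin\frac{2\pi}{3} = -\frac{1}{2} + i\frac{\sqrt{3}}{2},$
$\varepsilon_2 = \cos\frac{4\pi}{3} + i \sin\frac{4\pi}{3} = -\frac{1}{2} - i\frac{\sqrt{3}}{2}$    (7)

as well.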

All values of the $n$th root of a complex number $\alpha$ may be obtained
by multiplying one of these values by all the $n$th roots of unity. Indeed,
let $\beta$ be one of the values of the $n$th root of the number $\alpha$, i.e.,
$\beta^n = \alpha$, and let $\varepsilon$ be an arbitrary value of the $n$th root of unity, that
is, $\varepsilon^n = 1$. Then $(\beta\varepsilon)^n = \beta^n\varepsilon^n = \alpha$. Thus $\beta\varepsilon$ is also one of the
values for $\sqrt[n]{\alpha}$. Multiplying $\beta$ by each of the $n$th roots of unity, we
obtain $n$ distinct values of the $n$th root of the number $\alpha$, that is, all
the values of this root.


Example 1. One of the values of the cube root of $-8$ is $-2$. The two others
are, by (7), the numbers $-2\varepsilon_1 = 1 - i\sqrt{3}$ and $-2\varepsilon_2 = 1 + i\sqrt{3}$ (see Example 3
above).

Example 2. $\sqrt[4]{81}$ has four values: $3$, $-3$, $3i$, $-3i$.


The product of two $n$th roots of unity is itself an $n$th root of unity.
Indeed, if $\varepsilon^n = 1$ and $\eta^n = 1$, then $(\varepsilon\eta)^n = \varepsilon^n\eta^n = 1$. Also, the
reciprocal of an $n$th root of unity is itself such a root. Let $\varepsilon^n = 1$. Then
from $\varepsilon \cdot \varepsilon^{-1} = 1$ it follows that $\varepsilon^n(\varepsilon^{-1})^n = 1$, that is, $(\varepsilon^{-1})^n = 1$.
Generally, any power of an $n$th root of unity is also an $n$th root of
unity.
Any $k$th root of unity will also be an $l$th root of unity for any $l$
that is a multiple of $k$. Whence it follows that if we regard the entire
collection of $n$th roots of unity, then some of these roots will already
be $n'$-th roots of unity for some $n'$ which are divisors of the number $n$.
However, for any $n$, there exist $n$th roots of unity such that they
are not any lesser roots of unity. These roots are termed primitive
$n$th roots of unity. Their existence follows from formula (6): if the
value of a root corresponding to a given value of $k$ is denoted by
$\varepsilon_k$ (so that $\varepsilon_0 = 1$), then on the basis of De Moivre's formula (1),

$\varepsilon_1^k = \varepsilon_k$

Thus, no power of $\varepsilon_1$ less than the $n$th will be equal to 1, that is,
$\varepsilon_1 = \cos\frac{2\pi}{n} + i \sin\frac{2\pi}{n}$ is a primitive root.

An $n$th root $\varepsilon$ of unity is a primitive $n$th root if and only if its powers
$\varepsilon^k$, $k = 0, 1, \ldots, n - 1$, are distinct, that is, if they exhaust all
the $n$th roots of unity.

Indeed, if all the indicated powers of the number $\varepsilon$ are distinct,
then $\varepsilon$ is obviously an $n$th primitive root. But if, for example,
$\varepsilon^k = \varepsilon^l$ for $0 \le k < l \le n - 1$, then $\varepsilon^{l-k} = 1$; that is, because of the
inequalities $1 \le l - k \le n - 1$, the root $\varepsilon$ will not be primitive.

The number $\varepsilon_1$ found above is not, in the general case, the only
primitive $n$th root. The following theorem is used to find all of these
roots.

If $\varepsilon$ is a primitive $n$th root of unity, then the number $\varepsilon^k$ is a
primitive $n$th root if and only if $k$ is relatively prime to $n$.

Let $d$ be the greatest common divisor of the numbers $k$ and $n$.
If $d > 1$ and $k = dk'$, $n = dn'$, then

$(\varepsilon^k)^{n'} = \varepsilon^{kn'} = \varepsilon^{k'n} = (\varepsilon^n)^{k'} = 1$

that is, the root $\varepsilon^k$ is an $n'$-th root of unity.

On the other hand, let $d = 1$ and at the same time let the number
$\varepsilon^k$ be an $m$th root of unity, $1 \le m < n$. Thus,

$(\varepsilon^k)^m = \varepsilon^{km} = 1$

Since the number $\varepsilon$ is a primitive $n$th root of unity, that is, only its
powers with exponents that are multiples of $n$ can be equal to unity,
it follows that the number $km$ is a multiple of $n$. But since
$1 \le m < n$, the numbers $k$ and $n$ cannot be relatively prime; this
contradicts the assumption.

Thus the number of primitive $n$th roots of unity is equal to the
number of positive integers $k$ less than $n$ and relatively prime to $n$.
The expression for this number, which is ordinarily denoted by
$\varphi(n)$, may be found in any course of number theory.

If $p$ is a prime number, then all the $p$th roots of unity except unity
itself will be primitive $p$th roots of unity. On the other hand, among
the fourth roots of unity only $i$ and $-i$ (not 1 and $-1$) are primitive.
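The criterion for primitive roots can be checked in exact integer arithmetic. The sketch below relies on the fact stated above that $(\varepsilon_1^k)^m = 1$ exactly when $km$ is a multiple of $n$, so each root $\varepsilon_1^k$ is represented simply by its exponent $k$; the names `order` and `primitive_exponents` are assumptions made for the example.

```python
from math import gcd

def order(k, n):
    """Smallest m >= 1 such that (eps_1^k)^m = 1, i.e. k*m is a multiple of n."""
    m = 1
    while (k * m) % n != 0:
        m += 1
    return m

def primitive_exponents(n):
    """Exponents k in 0..n-1 for which eps_1^k is a primitive n-th root of unity."""
    return [k for k in range(n) if order(k, n) == n]

# The primitive roots are exactly those with k relatively prime to n.
for n in range(1, 40):
    assert primitive_exponents(n) == [k for k in range(n) if gcd(k, n) == 1]

print(primitive_exponents(4))   # exponents of i and -i
print(primitive_exponents(7))   # prime n: every root except unity
print(len(primitive_exponents(12)))
```

The length of the returned list is the count denoted by $\varphi(n)$ in the text.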


CHAPTER 5 


POLYNOMIALS 
AND THEIR ROOTS 


20. Operations on Polynomials 


The content of the first two chapters of this book (the theory
of determinants and the theory of systems of linear equations)
grew out of the elementary school course of algebra which proceeds
from one equation of the first degree in one unknown to systems of
two and three equations of the first degree in two and three unknowns
respectively. The second branch of elementary algebra, which in that
setting appeared to be the more important one, consisted in passing
from first-degree equations in one unknown to an arbitrary quadratic
equation again in one unknown, and on to certain special types
of equations of the third and fourth degree. This trend is further
developed into a very extensive and rich branch of higher algebra
devoted to the study of arbitrary equations of the $n$th degree in one
unknown. This division of algebra, which is historically the earlier
one, is treated in the present chapter and in some of the later
chapters of this text.

The general form of an $n$th-degree equation ($n$ a positive
integer) is

$a_0x^n + a_1x^{n-1} + \ldots + a_{n-1}x + a_n = 0$    (1)

The coefficients $a_0, a_1, \ldots, a_{n-1}, a_n$ of this equation will be
considered to be arbitrary complex numbers and the leading
coefficient $a_0$ must be nonzero.

If an equation like (1) is written, it is assumed that we have to
solve it. In other words, we have to find numerical values for the
unknown $x$ that satisfy the equation, that is, values which, when
substituted in place of the unknown and after all indicated
operations have been carried out, reduce the left member of (1) to zero.

However, it is advisable to replace the problem of solving
equation (1) by the more general one of studying the left member of this
equation:

$f(x) = a_0x^n + a_1x^{n-1} + \ldots + a_{n-1}x + a_n$    (2)
which is called a polynomial of degree $n$ in the unknown $x$. Remember
that only expressions like (2) are polynomials, that is, only the sum
of integral nonnegative powers of the unknown $x$ taken with certain
numerical coefficients, and not just any sum of monomials, as was
the case in elementary algebra. In particular, we will not consider
as polynomials expressions which contain negative or fractional
powers of the unknown x, such as 2z- — - Z or ax~^ + bx~'^ + 

£ 

+ cx~^ d ex + fx^ or x- 4 - 1 . For brevity, we will denote 
polynomials by the symbols $f(x)$, $g(x)$, $\varphi(x)$, and so on.

Two polynomials $f(x)$ and $g(x)$ will be considered equal (or
identically equal), $f(x) = g(x)$, only when the coefficients of like
powers of the unknown are equal. To be specific, no polynomial can
be equal to zero if at least one coefficient is nonzero and for this
reason, the equality sign used in the notation (1) of an $n$th-degree
equation has no connection with the above-defined equality of
polynomials. The $=$ sign connecting polynomials will always be
understood in the sense of an identical equality of these polynomials.

Thus, we look upon the $n$th-degree polynomial (2) as a certain
formal expression, fully defined by the set of its coefficients $a_0$,
$a_1, \ldots, a_n$, where $a_0 \neq 0$. The exact meaning of these words will
be explained in Chapter 10. Note that aside from the notation of
a polynomial given in (2) (in descending powers of the unknown $x$),
we may use other notations obtainable from (2) by a rearrangement
of the terms, say, in ascending powers of the unknown.

There is of course the possibility of regarding the polynomial
(2) from the viewpoint of mathematical analysis and of considering
it to be a complex function of a complex variable x. However, we
have to bear in mind that two functions are considered equal if
their values for all values of the variable x are equal. It is clear
that two polynomials which are equal in the above-mentioned formal
algebraic sense will also be equal as functions of x. The converse
will be proved only in Sec. 24, however. After that the algebraic
and function-theoretic viewpoints on the concept of a polynomial
with numerical coefficients will indeed be equivalent. For the time
being, however, each time we have to indicate precisely which sense
is meant. In the present section and the two following sections we
will look upon the polynomial as a formal-algebraic expression.

Naturally, there are nth-degree polynomials for any natural
number n. We consider all possible polynomials of this kind:
first-degree (or linear), quadratic, cubic, etc. We will also encounter
polynomials of degree zero, which are nonzero complex numbers. The
number zero will also be taken to be a polynomial. This is the
only polynomial whose degree is not defined.

For polynomials with complex coefficients we now define the
operations of addition and multiplication. These operations will be
introduced using the pattern of operations involving polynomials
with real coefficients, which are familiar from the course of
elementary algebra.

If we are given polynomials f(x) and g(x) with complex
coefficients (written, for convenience, in ascending powers of x):

    f(x) = a_0 + a_1 x + ... + a_{n-1} x^{n-1} + a_n x^n,   a_n ≠ 0,

    g(x) = b_0 + b_1 x + ... + b_{s-1} x^{s-1} + b_s x^s,   b_s ≠ 0

and if, for example, n ≥ s, then their sum is the polynomial

    f(x) + g(x) = c_0 + c_1 x + ... + c_{n-1} x^{n-1} + c_n x^n

whose coefficients are obtained by adding the coefficients of like
powers of the unknown in the polynomials f(x) and g(x), i.e.,

    c_i = a_i + b_i,   i = 0, 1, ..., n                           (3)

For n > s, the coefficients b_{s+1}, b_{s+2}, ..., b_n are to be
taken equal to zero. The degree of the sum will be equal to n if n is
greater than s, but for n = s it may accidentally prove less than n,
namely, when b_n = -a_n.

The product of polynomials f(x) and g(x) is the polynomial

    f(x) g(x) = d_0 + d_1 x + ... + d_{n+s-1} x^{n+s-1} + d_{n+s} x^{n+s}

whose coefficients are determined as follows:

    d_i = \sum_{k+l=i} a_k b_l,   i = 0, 1, ..., n+s-1, n+s      (4)

that is, the coefficient d_i is the result of multiplying those
coefficients of the polynomials f(x) and g(x) whose sum of indices is
equal to i and of adding all such products; in particular,
d_0 = a_0 b_0, d_1 = a_0 b_1 + a_1 b_0, ..., d_{n+s} = a_n b_s. From
the latter equality follows the inequality d_{n+s} ≠ 0, and therefore
the degree of the product of two polynomials is equal to the sum of the
degrees of these polynomials.

From this it follows that the product of polynomials different
from zero can never be equal to zero.
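The two definitions can be tried out numerically. What follows is only a sketch under our own assumptions: a polynomial is stored as a Python list of coefficients in ascending powers of x, the zero polynomial is the empty list, and the helper names `trim`, `poly_add` and `poly_mul` are ours, not the text's.

```python
from itertools import zip_longest

def trim(p):
    """Drop trailing zero coefficients so the last entry is the leading one."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_add(f, g):
    """Sum by formula (3): c_i = a_i + b_i, missing coefficients taken as 0."""
    return trim(a + b for a, b in zip_longest(f, g, fillvalue=0))

def poly_mul(f, g):
    """Product by formula (4): d_i is the sum of a_k * b_l over k + l = i."""
    if not f or not g:
        return []
    d = [0] * (len(f) + len(g) - 1)
    for k, a in enumerate(f):
        for l, b in enumerate(g):
            d[k + l] += a * b
    return trim(d)

# (i + x)(-i + x) = 1 + x^2: complex coefficients are handled like real ones.
product = poly_mul([1j, 1], [-1j, 1])

# Equal degrees with b_n = -a_n: the degree of the sum drops below n.
total = poly_add([1, 2, 3], [4, 0, -3])
```

In the second call the leading terms cancel, which is exactly the case n = s, b_n = -a_n singled out above.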

What properties do these operations that we have introduced
for polynomials have? The commutative and associative laws for
addition follow immediately from the validity of these properties
for addition of numbers, since we add the coefficients of each power
of the unknown separately. Subtraction is possible: the role of zero
is played by the number zero, which we have included in the class
of polynomials, and the opposite of f(x) will be the polynomial

    -f(x) = -a_0 - a_1 x - ... - a_{n-1} x^{n-1} - a_n x^n

The commutative law for multiplication follows from the
commutativity of multiplication of numbers and from the fact that
in the definition of a product of polynomials, the coefficients of
both factors f(x) and g(x) are of an equal status. The associativity
of multiplication is proved as follows: if besides the above-written
polynomials f(x) and g(x), we are given the polynomial

    h(x) = c_0 + c_1 x + ... + c_{t-1} x^{t-1} + c_t x^t,   c_t ≠ 0

then the coefficient of x^i, i = 0, 1, ..., n + s + t, in the product
[f(x) g(x)] h(x) is the number

    \sum_{j+m=i} ( \sum_{k+l=j} a_k b_l ) c_m = \sum_{k+l+m=i} a_k b_l c_m

and in the product f(x) [g(x) h(x)] the equal number

    \sum_{k+j=i} a_k ( \sum_{l+m=j} b_l c_m ) = \sum_{k+l+m=i} a_k b_l c_m

Finally, the validity of the distributive law follows from the
equation

    \sum_{k+l=i} (a_k + b_k) c_l = \sum_{k+l=i} a_k c_l + \sum_{k+l=i} b_k c_l

since the left-hand member of this equation is the coefficient of x^i
in the polynomial [f(x) + g(x)] h(x) and the right-hand member
is the coefficient of the same power of the unknown in the
polynomial f(x) h(x) + g(x) h(x).

It will be noted in the multiplication of polynomials that the
role of unity is played by 1, which is regarded as a polynomial of
degree zero. On the other hand, a polynomial f(x) has an inverse
f^{-1}(x),

    f(x) f^{-1}(x) = 1                                            (5)

if and only if f(x) is a polynomial of degree zero. Indeed, if f(x) is
a nonzero number a, then the inverse polynomial is the number a^{-1}.
But if f(x) has degree n ≥ 1, then the degree of the left side of (5)
would not be less than n if the polynomial f^{-1}(x) existed, whereas
the polynomial on the right is a polynomial of degree zero.

Consequently, the multiplication of polynomials has no inverse
operation (division). In this respect, the set of all polynomials with
complex coefficients resembles the set of all integers. The analogy
may be continued in that polynomials, like the integers, have
a division algorithm (with remainder). Elementary algebra describes
this algorithm for the case of polynomials with real coefficients.
However, since we are dealing with polynomials with complex
coefficients, it is well to review once again all the statements and
to carry out the proofs.

For any two polynomials f(x) and g(x) we can find polynomials
q(x) and r(x) such that

    f(x) = g(x) q(x) + r(x)                                       (6)

the degree of r(x) being less than the degree of g(x), or r(x) = 0.
The polynomials q(x) and r(x) satisfying this condition are defined
uniquely.

Let us first prove the latter half of the theorem. Let there also
be polynomials \bar{q}(x) and \bar{r}(x) that likewise satisfy the
equation

    f(x) = g(x) \bar{q}(x) + \bar{r}(x)                           (7)

the degree of \bar{r}(x) again being less than the degree of g(x)*.
Equating the right sides of (6) and (7), we obtain

    g(x) [q(x) - \bar{q}(x)] = \bar{r}(x) - r(x)

The degree of the right side of this equation is less than the degree
of g(x), but the degree of the left side would be greater than or equal
to the degree of g(x) for q(x) - \bar{q}(x) ≠ 0. Therefore, it must be
true that q(x) - \bar{q}(x) = 0, that is, q(x) = \bar{q}(x), but then
r(x) = \bar{r}(x), which is what we set out to prove.

We now prove the first part of the theorem. Let the polynomials
f(x) and g(x) have degrees n and s, respectively. If n < s, then we
can put q(x) = 0, r(x) = f(x). But if n ≥ s, then we take advantage
of the same method by which in elementary algebra we divide
polynomials with real coefficients (in descending powers of the
unknown). Suppose

    f(x) = a_0 x^n + a_1 x^{n-1} + ... + a_{n-1} x + a_n,   a_0 ≠ 0,

    g(x) = b_0 x^s + b_1 x^{s-1} + ... + b_{s-1} x + b_s,   b_0 ≠ 0

Setting

    f(x) - (a_0 / b_0) x^{n-s} g(x) = f_1(x)                      (8)

we get a polynomial whose degree is less than n. Denote this degree
by n_1 and the leading coefficient of the polynomial f_1(x) by a_{10}.
Now, if we still have n_1 ≥ s, set

    f_1(x) - (a_{10} / b_0) x^{n_1 - s} g(x) = f_2(x)             (8_1)

Denoting by n_2 the degree and by a_{20} the leading coefficient of the
polynomial f_2(x), we set

    f_2(x) - (a_{20} / b_0) x^{n_2 - s} g(x) = f_3(x)             (8_2)

and so forth.

Since the degrees of the polynomials f_1(x), f_2(x), ... decrease,
n > n_1 > n_2 > ..., we finally arrive (after a finite number of steps)
at the polynomial f_k(x),

    f_{k-1}(x) - (a_{k-1,0} / b_0) x^{n_{k-1} - s} g(x) = f_k(x)  (8_{k-1})

the degree n_k of which is less than s. Our procedure has come to
a halt. Now adding (8), (8_1), ..., (8_{k-1}), we get

    f(x) - ( (a_0/b_0) x^{n-s} + (a_{10}/b_0) x^{n_1 - s} + ...
             + (a_{k-1,0}/b_0) x^{n_{k-1} - s} ) g(x) = f_k(x)

Thus, the polynomials

    q(x) = (a_0/b_0) x^{n-s} + (a_{10}/b_0) x^{n_1 - s} + ...
           + (a_{k-1,0}/b_0) x^{n_{k-1} - s},
    r(x) = f_k(x)

do indeed satisfy (6), and the degree of r(x) is in fact less than the
degree of g(x).

* Or the remainder is zero. This case will not be specifically stated
in the sequel.

Note that the polynomial q(x) is called the quotient obtained
from the division of f(x) by g(x), and r(x) is the remainder.

From this consideration of the division algorithm, it is easy
to establish that if f(x) and g(x) are polynomials with real
coefficients, then the coefficients of all polynomials f_1(x),
f_2(x), ... and therefore also the coefficients of the quotient q(x)
and the remainder r(x) will be real.
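The successive subtractions (8), (8_1), ... translate directly into a loop. The following is a minimal sketch, assuming coefficients are listed in descending powers of x and are rational (Python's `Fraction` keeps the arithmetic exact); the names `strip` and `poly_divmod` are our own.

```python
from fractions import Fraction

def strip(p):
    """Drop leading zeros; the zero polynomial becomes the empty list."""
    p = [Fraction(a) for a in p]
    while p and p[0] == 0:
        p.pop(0)
    return p

def poly_divmod(f, g):
    """Divide f by g (coefficients in descending powers of x).

    Returns (q, r) with f = g*q + r and deg r < deg g, or r = [] (zero).
    """
    r, g = strip(f), strip(g)
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)      # the exponent n_i - s of the current step
        coef = r[0] / g[0]           # the ratio a_{i0} / b_0
        q[len(q) - 1 - shift] = coef
        padded = g + [Fraction(0)] * shift
        # Subtract coef * x^shift * g(x); the leading term cancels.
        r = strip(a - coef * b for a, b in zip(r, padded))
    return q, r

# Divide 3x^3 + 10x^2 + 2x - 3 by x^2 + 5x + 6.
q, r = poly_divmod([3, 10, 2, -3], [1, 5, 6])
```

Since only additions, multiplications and divisions of the given coefficients occur, rational input yields a rational quotient and remainder, in line with the closing remark of the section.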

21. Divisors. Greatest Common Divisor

Suppose we have nonzero polynomials f(x) and φ(x) with complex
coefficients. If the remainder after dividing f(x) by φ(x) is
zero, we then say that f(x) is divisible (exactly divisible) by φ(x).
Here, the polynomial φ(x) is called the divisor of the polynomial
f(x).

The polynomial φ(x) is a divisor of the polynomial f(x) if and
only if there exists a polynomial ψ(x) that satisfies the equation

    f(x) = φ(x) ψ(x)                                              (1)

Indeed, if φ(x) is a divisor of f(x), then for ψ(x) we should take
the quotient of f(x) divided by φ(x). Conversely, let there be a
polynomial ψ(x) which satisfies (1). From the proof given in the
preceding section on the uniqueness of the polynomials q(x) and r(x)
which satisfy the equation

    f(x) = φ(x) q(x) + r(x)

and the condition that the degree of r(x) be less than the degree
of φ(x), it follows in our case that the quotient of f(x) by φ(x)
is equal to ψ(x), and the remainder is zero.

Naturally, if equation (1) holds, then ψ(x) is also a divisor
of f(x). Furthermore, it is obvious that the degree of φ(x) does not
exceed the degree of f(x).

Note that if the polynomial f(x) and its divisor φ(x) both have
rational or real coefficients, then the polynomial ψ(x) as well will
have rational or, respectively, real coefficients since it is found
by means of the division algorithm. Of course, a polynomial with
rational or real coefficients can also have divisors, not all the
coefficients of which are rational (or real). This is shown for example
by the equation

    x^2 + 1 = (x - i)(x + i)

We indicate a few basic properties of divisibility of polynomials
that will be very useful later on.

I. If f(x) is divisible by g(x), and g(x) is divisible by h(x), then
f(x) is divisible by h(x).

Since, by hypothesis, f(x) = g(x) φ(x) and g(x) = h(x) ψ(x),
it follows that f(x) = h(x) [ψ(x) φ(x)].

II. If f(x) and g(x) are divisible by φ(x), then their sum and
difference are also divisible by φ(x).

Indeed, from the equations f(x) = φ(x) ψ(x) and g(x) = φ(x) χ(x)
it follows that f(x) ± g(x) = φ(x) [ψ(x) ± χ(x)].

III. If f(x) is divisible by φ(x), then the product of f(x) by any
polynomial g(x) is also divisible by φ(x).

True enough, if f(x) = φ(x) ψ(x), then it follows that
f(x) g(x) = φ(x) [ψ(x) g(x)].

From II and III we have the following property.

IV. If each of the polynomials f_1(x), f_2(x), ..., f_k(x) is
divisible by φ(x), then the following polynomial will also be divisible
by φ(x):

    f_1(x) g_1(x) + f_2(x) g_2(x) + ... + f_k(x) g_k(x)

where g_1(x), g_2(x), ..., g_k(x) are arbitrary polynomials.

V. Any polynomial f(x) is divisible by any polynomial of degree
zero.

Indeed, if f(x) = a_0 x^n + a_1 x^{n-1} + ... + a_n and c is an
arbitrary number not equal to zero, that is, an arbitrary polynomial of
degree zero, then

    f(x) = c ( (a_0/c) x^n + (a_1/c) x^{n-1} + ... + a_n/c )

VI. If f(x) is divisible by φ(x), then f(x) is divisible by cφ(x)
as well, where c is an arbitrary number different from zero.

From the equation f(x) = φ(x) ψ(x) follows the equation
f(x) = [cφ(x)] · [c^{-1} ψ(x)].

VII. The polynomials cf(x), c ≠ 0, and only such polynomials
are divisors of the polynomial f(x) that have the same degree as f(x).

Indeed, f(x) = c^{-1} [cf(x)], or f(x) is divisible by cf(x).

If, on the other hand, f(x) is divisible by φ(x), and the degrees
of f(x) and φ(x) coincide, then the degree of the quotient of f(x)
by φ(x) must be zero, i.e., f(x) = dφ(x), d ≠ 0, whence
φ(x) = d^{-1} f(x).
From this we get the following property.

VIII. The polynomials f(x), g(x) are simultaneously divisible
one by the other if and only if g(x) = cf(x), c ≠ 0.

Finally, from VIII and I we get the property

IX. Any divisor of one of two polynomials f(x), cf(x), where
c ≠ 0, is a divisor of the other polynomial as well.

Greatest common divisor. Suppose we have arbitrary polynomials
f(x) and g(x). The polynomial φ(x) is called the common
divisor of f(x) and g(x) if it is a divisor of each of them. Property V
(see above) shows that the common divisors of the polynomials
f(x) and g(x) include all polynomials of degree zero. If there are
no other common divisors of these two polynomials, then the
polynomials are called relatively prime.

But in the general case, the polynomials f(x) and g(x) may have
divisors which depend on x; we wish to introduce the concept of the
greatest common divisor of these polynomials.

It would be inconvenient to take a definition stating that the
greatest common divisor of the polynomials f(x) and g(x) is their
common divisor of highest degree. On the one hand, as yet we do
not know whether f(x) and g(x) have many different common
divisors of highest degree which differ not only in a zero-degree
factor. In other words, isn't this definition too indeterminate?
On the other hand, the reader will recall from elementary arithmetic
the problem of finding the greatest common divisor of integers
and also that the greatest common divisor 6 of the integers 12
and 18 is not only the greatest among the common divisors of these
numbers but is even divisible by any other of their common
divisors; the other common divisors of 12 and 18 are 1, 2, 3, -1, -2,
-3, -6.

That is why, for polynomials, we have the following definition.

The greatest common divisor of the nonzero polynomials f(x)
and g(x) is a polynomial d(x), which is their common divisor and,
also, is itself divisible by any other common divisor of these
polynomials. The greatest common divisor of the polynomials f(x) and
g(x) is denoted by the symbol (f(x), g(x)).

This definition leaves open the question of whether there exists
a greatest common divisor of any polynomials f(x) and g(x). We
will now answer this question in the affirmative. At the same time
we will give a practical method for finding the greatest common
divisor of the given polynomials. Quite naturally we cannot carry
over the procedure used for finding the greatest common divisor
of integers since we do not as yet have anything analogous in
polynomials to the decomposition of an integer into a product of prime
factors. However, for integers there is also another method called
the algorithm of successive division, or Euclid's algorithm. This
procedure is quite applicable to polynomials.
Euclid's algorithm for polynomials consists in the following.
Let there be given the polynomials f(x) and g(x). We divide f(x)
by g(x) and obtain, generally speaking, a remainder r_1(x). Then
divide g(x) by r_1(x) and get a remainder r_2(x), divide r_1(x) by
r_2(x) and so on. Since the degrees of the remainders decrease
steadily, there will come a time in this sequence of divisions when
the division is exact and the procedure terminates. The remainder
r_k(x) which divides exactly the preceding remainder r_{k-1}(x) is the
greatest common divisor of the polynomials f(x) and g(x).

By way of proof, let us write the contents of the preceding
paragraph in the form of a chain of equations:

    f(x) = g(x) q_1(x) + r_1(x),
    g(x) = r_1(x) q_2(x) + r_2(x),
    r_1(x) = r_2(x) q_3(x) + r_3(x),
    . . . . . . . . . . . . . . . . . . .                         (2)
    r_{k-3}(x) = r_{k-2}(x) q_{k-1}(x) + r_{k-1}(x),
    r_{k-2}(x) = r_{k-1}(x) q_k(x) + r_k(x),
    r_{k-1}(x) = r_k(x) q_{k+1}(x)

The last equation shows that r_k(x) is a divisor of r_{k-1}(x).
Whence it follows that both terms of the right member of the second
last equation are divisible by r_k(x) and so r_k(x) is also a divisor
of r_{k-2}(x). Rising upwards in this fashion, we find that r_k(x) is
also a divisor of r_{k-3}(x), ..., r_2(x), r_1(x). Whence, by virtue
of the second equation, it will follow that r_k(x) is a divisor of g(x)
and therefore, on the basis of the first equation, a divisor of f(x)
as well. Thus, r_k(x) is a common divisor of f(x) and g(x).

Now let us take an arbitrary common divisor φ(x) of the
polynomials f(x) and g(x). Since the left side and the first term of the
right side of the first of the equations (2) are divisible by φ(x),
it follows that r_1(x) is also divisible by φ(x). Passing to the second
and successive equations, we find in the same way that the
polynomials r_2(x), r_3(x), ... are divisible by φ(x). Finally, if it is
proved that r_{k-2}(x) and r_{k-1}(x) are divisible by φ(x), then from
the second last equation we find that r_k(x) is divisible by φ(x).
Thus, r_k(x) is indeed the greatest common divisor of f(x) and g(x).
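The chain of divisions just described can be sketched in a few lines. This is an illustration under our own assumptions, not a definitive implementation: coefficients are rational and listed in descending powers of x, at least one input is nonzero, the result is normalized to leading coefficient 1, and the names `strip`, `poly_rem` and `poly_gcd` are hypothetical helpers of ours.

```python
from fractions import Fraction

def strip(p):
    """Drop leading zeros; the zero polynomial becomes the empty list."""
    p = [Fraction(a) for a in p]
    while p and p[0] == 0:
        p.pop(0)
    return p

def poly_rem(f, g):
    """Remainder of f divided by a nonzero g (descending powers)."""
    r, g = strip(f), strip(g)
    while len(r) >= len(g):
        coef = r[0] / g[0]
        for j, b in enumerate(g):
            r[j] -= coef * b         # cancels the leading term of r
        r = strip(r)
    return r

def poly_gcd(f, g):
    """Last nonzero remainder of the Euclidean chain, made monic."""
    f, g = strip(f), strip(g)
    while g:
        f, g = g, poly_rem(f, g)     # one equation of the chain per pass
    return [a / f[0] for a in f]

# f(x) = x^4 + 3x^3 - x^2 - 4x - 3 and g(x) = 3x^3 + 10x^2 + 2x - 3
d = poly_gcd([1, 3, -1, -4, -3], [3, 10, 2, -3])
```

Unlike a hand computation, the sketch does not rescale intermediate remainders to avoid fractions; exact rational arithmetic makes that unnecessary, and the final normalization removes the zero-degree factor.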

We have thus proved that any two polynomials have a greatest
common divisor, and we have a procedure for computing it. This
method shows that if the polynomials f(x) and g(x) both have rational
or real coefficients, then the coefficients of their greatest common
divisor will also be rational or real, though of course these
polynomials may also have other divisors, not all coefficients of which
are rational (real). Thus, the polynomials with rational coefficients

    f(x) = x^3 - 3x^2 - 2x + 6,   g(x) = x^3 + x^2 - 2x - 2

have as greatest common divisor the polynomial with rational
coefficients x^2 - 2, though they have a common divisor x - \sqrt{2},
not all the coefficients of which are rational.

If d(x) is the greatest common divisor of the polynomials f(x)
and g(x), then, as Properties VIII and IX (see above) show, for
the greatest common divisor of these polynomials we could also
choose the polynomial cd(x), where c is an arbitrary number
different from zero. In other words, the greatest common divisor of two
polynomials is only determined to within a factor of degree zero. In view
of this fact we can agree that the leading coefficient of the greatest
common divisor of two polynomials will always be considered equal
to unity. Using this condition, we can say that two polynomials are
relatively prime if and only if their greatest common divisor is unity.
Indeed, for the greatest common divisor of two relatively prime
polynomials we can take any number different from zero; but
multiplying it by the inverse, we get unity.


Example. Find the greatest common divisor of the polynomials

    f(x) = x^4 + 3x^3 - x^2 - 4x - 3,   g(x) = 3x^3 + 10x^2 + 2x - 3

Applying Euclid's algorithm to polynomials with integral coefficients,
we can (to avoid fractional coefficients) multiply the dividend or
reduce the divisor by any nonzero number (this may be done either at
the start or at any other time in the division). Quite naturally, this
will distort the quotient, but the remainders that interest us will
only acquire some factor of zero degree, which as we know is quite
permissible when seeking the greatest common divisor.

We divide f(x) by g(x) but first multiply f(x) by 3:

                              x + 1
    3x^3 + 10x^2 + 2x - 3  |  3x^4 +  9x^3 -  3x^2 - 12x -  9
                              3x^4 + 10x^3 +  2x^2 -  3x
                              --------------------------------
                                     - x^3 -  5x^2 -  9x -  9    (multiply by -3)
                                      3x^3 + 15x^2 + 27x + 27
                                      3x^3 + 10x^2 +  2x -  3
                                      ------------------------
                                              5x^2 + 25x + 30

Thus, the first remainder, after dividing by 5, will be
r_1(x) = x^2 + 5x + 6. We divide the polynomial g(x) by it:

                       3x - 5
    x^2 + 5x + 6  |  3x^3 + 10x^2 +  2x -  3
                     3x^3 + 15x^2 + 18x
                     -----------------------
                           - 5x^2 - 16x -  3
                           - 5x^2 - 25x - 30
                           -----------------
                                    9x + 27

The second remainder, after dividing by 9, is thus r_2(x) = x + 3. Since

    r_1(x) = r_2(x) (x + 2)

it follows that r_2(x) will be the last remainder which exactly divides
the preceding remainder. It will consequently be the desired greatest
common divisor:

    (f(x), g(x)) = x + 3

We use the Euclidean algorithm to prove the following theorem.
If d(x) is the greatest common divisor of the polynomials f(x)
and g(x), then it is possible to find polynomials u(x) and v(x) such that

    f(x) u(x) + g(x) v(x) = d(x)                                  (3)

If the degrees of the polynomials f(x) and g(x) exceed zero, we can
then take it that the degree of u(x) is less than the degree of g(x), and
the degree of v(x) is less than the degree of f(x).

The proof rests on the equations (2). If we take into
consideration that r_k(x) = d(x) and if we put u_1(x) = 1,
v_1(x) = -q_k(x), then the second last of the equations (2) yields

    d(x) = r_{k-2}(x) u_1(x) + r_{k-1}(x) v_1(x)

Substituting the expression of r_{k-1}(x) in terms of r_{k-3}(x) and
r_{k-2}(x) from the preceding equation (2), we get

    d(x) = r_{k-3}(x) u_2(x) + r_{k-2}(x) v_2(x)

where, obviously, u_2(x) = v_1(x), v_2(x) = u_1(x) - v_1(x) q_{k-1}(x).
Continuing upwards through the equations of (2), we finally arrive
at the equation (3) being proved.

To prove the second assertion of the theorem, assume that the
polynomials u(x) and v(x) which satisfy (3) have already been
found, but that, say, the degree of u(x) is greater than or equal to
the degree of g(x). Divide u(x) by g(x):

    u(x) = g(x) q(x) + r(x)

where the degree of r(x) is less than the degree of g(x), and substitute
this expression into (3). We get the equation

    f(x) r(x) + g(x) [v(x) + f(x) q(x)] = d(x)

The degree of the factor of f(x) is now less than the degree of g(x).
The degree of the polynomial in square brackets will in turn be
less than the degree of f(x), since otherwise the degree of the second
summand in the left-hand member would not be less than the degree
of the product g(x) f(x), and since the degree of the first summand
is less than the degree of this product, the entire left side would
have a degree greater than or equal to the degree of g(x) f(x),
whereas the polynomial d(x) is definitely (given our assumptions) of
lower degree.

This proves the theorem. At the same time we see that if the
polynomials f(x) and g(x) have rational or real coefficients, then
we can also choose the polynomials u(x) and v(x), which satisfy
(3), so that their coefficients are rational or real.
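A computational sketch of the theorem follows. Note the swapped-in technique: instead of substituting back up the chain (2) as in the proof, it carries the multipliers forward alongside the remainders (the usual "extended" Euclidean scheme), which produces multipliers satisfying (3) with the same d(x). Rational coefficients in descending powers, a nonzero f(x), and all helper names are assumptions of ours.

```python
from fractions import Fraction

def strip(p):
    p = [Fraction(a) for a in p]
    while p and p[0] == 0:
        p.pop(0)
    return p

def add(f, g):
    n = max(len(f), len(g))
    f = [0] * (n - len(f)) + list(f)
    g = [0] * (n - len(g)) + list(g)
    return strip(a + b for a, b in zip(f, g))

def mul(f, g):
    if not f or not g:
        return []
    d = [Fraction(0)] * (len(f) + len(g) - 1)
    for k, a in enumerate(f):
        for l, b in enumerate(g):
            d[k + l] += a * b
    return strip(d)

def divmod_poly(f, g):
    r, g = strip(f), strip(g)
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coef = r[0] / g[0]
        q[len(q) - 1 - shift] = coef
        r = strip(a - coef * b for a, b in zip(r, g + [Fraction(0)] * shift))
    return q, r

def bezout(f, g):
    """Return (d, u, v) with f*u + g*v = d, where d is the monic g.c.d."""
    r0, r1 = strip(f), strip(g)
    u0, u1 = [Fraction(1)], []       # invariant: r0 = f*u0 + g*v0
    v0, v1 = [], [Fraction(1)]       #            r1 = f*u1 + g*v1
    while r1:
        q, r = divmod_poly(r0, r1)
        neg_q = [-c for c in q]
        r0, r1 = r1, r
        u0, u1 = u1, add(u0, mul(neg_q, u1))
        v0, v1 = v1, add(v0, mul(neg_q, v1))
    lead = r0[0]
    return ([c / lead for c in r0], [c / lead for c in u0],
            [c / lead for c in v0])

f = [1, -1, 3, -10]                  # x^3 - x^2 + 3x - 10
g = [1, 6, -9, -14]                  # x^3 + 6x^2 - 9x - 14
d, u, v = bezout(f, g)
```

Because every step uses only rational operations on the given coefficients, the multipliers come out rational whenever f(x) and g(x) are, as remarked above.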
Example. Find the polynomials u(x) and v(x) which satisfy (3) for

    f(x) = x^3 - x^2 + 3x - 10,   g(x) = x^3 + 6x^2 - 9x - 14

Apply Euclid's algorithm to these polynomials. This time, when
performing the divisions, we cannot allow for any distortion of the
quotients since these quotients are used to find the polynomials u(x)
and v(x). We obtain the following system of equations:

    f(x) = g(x) · 1 + (-7x^2 + 12x + 4),

    g(x) = (-7x^2 + 12x + 4) ( -(1/7) x - 54/49 ) + (235/49) (x - 2),

    -7x^2 + 12x + 4 = (x - 2) (-7x - 2)

Whence it follows that (f(x), g(x)) = x - 2 and that

    u(x) = (7/235) x + 54/235,   v(x) = -(7/235) x - 1/47

Applying the above-proved theorem to relatively prime
polynomials, we get the following result.

The polynomials f(x) and g(x) are relatively prime if and only
if it is possible to find polynomials u(x) and v(x) that satisfy
the equation

    f(x) u(x) + g(x) v(x) = 1                                     (4)

Proceeding from this result, we can prove a number of simple
but important theorems on relatively prime polynomials:

(a) If a polynomial f(x) is relatively prime to each of the
polynomials φ(x) and ψ(x), then it is also relatively prime to their
product.

Indeed, by (4), there are polynomials u(x) and v(x) such that

    f(x) u(x) + φ(x) v(x) = 1

Multiplying this equation by ψ(x), we get

    f(x) [u(x) ψ(x)] + [φ(x) ψ(x)] v(x) = ψ(x)

whence it follows that any common divisor of f(x) and φ(x) ψ(x)
would also be a divisor of ψ(x); however, it is given that
(f(x), ψ(x)) = 1.

(b) If the product of the polynomials f(x) and g(x) is divisible by
φ(x) but f(x) and φ(x) are relatively prime, then g(x) is divisible
by φ(x).

This is true since by multiplying the equation

    f(x) u(x) + φ(x) v(x) = 1

by g(x), we get

    [f(x) g(x)] u(x) + φ(x) [v(x) g(x)] = g(x)

Both terms of the left-hand member of this equation are divisible
by φ(x); hence g(x) is divisible by φ(x).

(c) If the polynomial f(x) is divisible by each of the polynomials
φ(x) and ψ(x), which are relatively prime, then f(x) is also divisible
by their product.

Indeed, f(x) = φ(x) \bar{φ}(x), so that the product on the right is
divisible by ψ(x). Therefore, by (b), \bar{φ}(x) is divisible by ψ(x),
\bar{φ}(x) = ψ(x) \bar{ψ}(x), whence f(x) = [φ(x) ψ(x)] \bar{ψ}(x).

The definition of greatest common divisor may be extended to
the case of any finite system of polynomials: the greatest common
divisor of the polynomials f_1(x), f_2(x), ..., f_s(x) is that common
divisor of these polynomials which is divisible by any other common
divisor of these polynomials. The existence of a greatest common
divisor for any finite system of polynomials is a consequence of the
following theorem, which also provides a procedure for calculating it.

The greatest common divisor of the polynomials f_1(x), f_2(x), ...,
f_s(x) is equal to the greatest common divisor of the polynomial
f_s(x) and the greatest common divisor of the polynomials f_1(x),
f_2(x), ..., f_{s-1}(x).

Indeed, for s = 2 the theorem is obvious. We thus assume that
for the case s - 1 it holds true, that is, in particular, we have already
proved the existence of the greatest common divisor d(x) of the
polynomials f_1(x), f_2(x), ..., f_{s-1}(x). Denote by \bar{d}(x) the
greatest common divisor of the polynomials d(x) and f_s(x). It will
obviously be a common divisor of all the given polynomials. On the
other hand, any other common divisor of these polynomials will
also be a divisor of d(x) and, for this reason, of \bar{d}(x) as well.

In particular, the system of polynomials f_1(x), f_2(x), ...,
f_s(x) is called relatively prime if only zero-degree polynomials
are the common divisors of these polynomials; that is to say,
if their greatest common divisor is unity. If s > 2, then these
polynomials need not be pairwise relatively prime. Thus, the system
of polynomials


    f(x) = x^3 - 7x^2 + 7x + 15,   g(x) = x^2 - x - 20,

    h(x) = x^3 + x^2 - 12x

is relatively prime, although

    (f(x), g(x)) = x - 5,   (f(x), h(x)) = x - 3,   (g(x), h(x)) = x + 4

The reader will readily obtain a generalization of the above- 
proved theorems (a) to (c) on relatively prime polynomials to the 
case of any finite number of polynomials. 
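The reduction of a system to successive pairs is a fold over the list of polynomials. The sketch below rests on several assumptions of ours: rational coefficients in descending powers, monic normalization of each g.c.d., hypothetical helper names, and, for the third polynomial of the illustration, the choice h(x) = x^3 + x^2 - 12x, which is consistent with the pairwise divisors x - 3 and x + 4 quoted above.

```python
from fractions import Fraction
from functools import reduce

def strip(p):
    p = [Fraction(a) for a in p]
    while p and p[0] == 0:
        p.pop(0)
    return p

def poly_rem(f, g):
    r, g = strip(f), strip(g)
    while len(r) >= len(g):
        coef = r[0] / g[0]
        for j, b in enumerate(g):
            r[j] -= coef * b
        r = strip(r)
    return r

def poly_gcd(f, g):
    f, g = strip(f), strip(g)
    while g:
        f, g = g, poly_rem(f, g)
    return [a / f[0] for a in f]

def gcd_system(polys):
    """g.c.d. of f_1, ..., f_s as the g.c.d. of f_s with that of the rest."""
    return reduce(poly_gcd, polys)

f = [1, -7, 7, 15]       # x^3 - 7x^2 + 7x + 15
g = [1, -1, -20]         # x^2 - x - 20
h = [1, 1, -12, 0]       # x^3 + x^2 - 12x (our assumed third polynomial)
common = gcd_system([f, g, h])
```

Each pair has a nontrivial common divisor, yet the fold ends in a polynomial of degree zero: a relatively prime system need not be pairwise relatively prime.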
22. Roots of Polynomials

We have already (Sec. 20) dealt with the values of a polynomial
when we spoke of the function-theoretic approach to the concept
of a polynomial. Let us recall the definition.

If

    f(x) = a_0 x^n + a_1 x^{n-1} + ... + a_n                      (1)

is some polynomial and c is a number, then the number

    f(c) = a_0 c^n + a_1 c^{n-1} + ... + a_n

obtained by replacing in (1) the unknown x by the number c and
by subsequent performance of all indicated operations, is called
the value of the polynomial f(x) for x = c. Quite naturally, if
f(x) = g(x) in the sense of an algebraic equality of polynomials as
defined in Sec. 20, then f(c) = g(c) for any c.

It is also easy to see that if

    φ(x) = f(x) + g(x),   ψ(x) = f(x) g(x)

then

    φ(c) = f(c) + g(c),   ψ(c) = f(c) g(c)

In other words, the addition and multiplication of polynomials
defined in Sec. 20 become, from the function-theoretic approach
to polynomials, the addition and multiplication of functions, to be
understood in the sense of addition and multiplication of the
appropriate values of these functions.

If f(c) = 0, that is, the polynomial f(x) vanishes when the
number c is substituted in place of the unknown, then c is termed
a root of the polynomial f(x) [or of the equation f(x) = 0]. It will
now be shown that this concept applies completely to the theory
of divisibility of polynomials, which was the topic of discussion
in the preceding section.

If we divide the polynomial f(x) by an arbitrary polynomial
of degree one (or, as we shall say from now on, by a linear polynomial),
then the remainder will either be a polynomial of degree zero, or
zero, which is to say some number r. The following theorem allows
us to find this remainder without performing the division itself
when we divide by a polynomial of the form x - c.

The remainder resulting from the division of a polynomial f(x)
by a linear polynomial x - c is equal to the value f(c) of f(x) for
x = c.

Let

    f(x) = (x - c) q(x) + r

Taking the values of both sides of this equation when x = c, we get

    f(c) = (c - c) q(c) + r = r

which proves the theorem.
An exceedingly important corollary follows from this fact.

The number c is a root of the polynomial f(x) if and only if f(x)
is divisible by x - c.

On the other hand, if f(x) is divisible by some linear polynomial
ax + b, then evidently it is also divisible by the polynomial
x - (-b/a), that is, by a polynomial of the form x - c. Thus,
finding the roots of a polynomial f(x) is equivalent to finding its linear
divisors.

In the light of the foregoing, it is of interest to examine the method
of dividing a polynomial f(x) by a linear binomial x - c, which
is simpler than the general algorithm for dividing polynomials.
This method is called the Horner method. Let

    f(x) = a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + ... + a_n        (2)

and let

    f(x) = (x - c) q(x) + r                                       (3)

where

    q(x) = b_0 x^{n-1} + b_1 x^{n-2} + b_2 x^{n-3} + ... + b_{n-1}

Comparing the coefficients of like powers of x in (3), we get

    a_0 = b_0,
    a_1 = b_1 - c b_0,
    a_2 = b_2 - c b_1,
    . . . . . . . . . .
    a_{n-1} = b_{n-1} - c b_{n-2},
    a_n = r - c b_{n-1}

From this it follows that b_0 = a_0, b_k = c b_{k-1} + a_k,
k = 1, 2, ..., n - 1, that is, the coefficient b_k is obtained by
multiplying the preceding coefficient b_{k-1} by c and by adding the
corresponding coefficient a_k; finally, r = c b_{n-1} + a_n, that is,
the remainder r, which as we know is equal to f(c), is also obtained
by the same rule. Thus, the remainder and the coefficients of the
quotient may be successively obtained by computations of the same type,
which can be arranged in a scheme, as the following examples
demonstrate.

Example 1. Divide f(x) = 2x^5 - x^4 - 3x^3 + x - 3 by x - 3.
Form an array in which the coefficients of the polynomial f(x) are
located above the bar, and the corresponding coefficients of the
quotient and the remainder (computed successively) are located below
the bar; on the left is the value of c in the given example:

      |  2    -1          -3           0            1             -3
    --+----------------------------------------------------------------------
    3 |  2    3·2-1=5     3·5-3=12     3·12+0=36    3·36+1=109    3·109-3=324

Thus, the desired quotient will be

    q(x) = 2x^4 + 5x^3 + 12x^2 + 36x + 109

and the remainder will be r = f(3) = 324.
Example 2. Divide f(x) = x^4 - 8x^3 + x^2 + 4x - 9 by x + 1.

       |  1    -8     1     4    -9
    ---+----------------------------
    -1 |  1    -9    10    -6    -3

The quotient will therefore be

    q(x) = x^3 - 9x^2 + 10x - 6

and the remainder r = f(-1) = -3.

These examples show that the Horner method may also be used
for quick computation of the value of a polynomial for a given value
of the unknown.
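The scheme amounts to a single pass over the coefficients. A minimal sketch, assuming the coefficients a_0, ..., a_n are given in descending powers of x (the function name `horner` is ours):

```python
def horner(coeffs, c):
    """Divide a_0 x^n + ... + a_n by x - c.

    Returns (quotient coefficients b_0, ..., b_{n-1}, remainder r = f(c)).
    """
    b = [coeffs[0]]                  # b_0 = a_0
    for a in coeffs[1:]:
        b.append(c * b[-1] + a)      # b_k = c * b_{k-1} + a_k
    return b[:-1], b[-1]             # the last value computed is r

# Example 1: 2x^5 - x^4 - 3x^3 + x - 3 divided by x - 3
q1, r1 = horner([2, -1, -3, 0, 1, -3], 3)

# Example 2: x^4 - 8x^3 + x^2 + 4x - 9 divided by x + 1, i.e. c = -1
q2, r2 = horner([1, -8, 1, 4, -9], -1)
```

Evaluating f(c) this way takes n multiplications and n additions, with no powers of c computed separately.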

Multiple roots. If c is a root of the polynomial f(x), i.e.,
f(c) = 0, then f(x) is, as we know, divisible by x - c. It may turn out
that the polynomial f(x) is not only divisible by the first power
of the linear binomial x - c, but by higher powers of it as well.
In any case, there will be a natural number k such that f(x) is
exactly divisible by (x - c)^k, but is not divisible by (x - c)^{k+1}.
Therefore,

    f(x) = (x - c)^k φ(x)

where the polynomial φ(x) is no longer divisible by x - c, that
is, does not have c as its root. The number k is called the multiplicity
of the root c in the polynomial f(x), and the root c is the k-fold root
of this polynomial. If k = 1, then we say that the root c is simple.

The concept of a multiple root is closely related to the concept
of the derivative of a polynomial. However, we are studying polynomials
with any complex coefficients and for this reason we cannot
simply take advantage of the concept of a derivative as introduced
in the course of mathematical analysis. What follows is to be regarded
as a definition of the derivative of a polynomial which is independent
of that given in the course of analysis.

Suppose we have an nth-degree polynomial

    $f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_{n-1} x + a_n$

with arbitrary complex coefficients. Its derivative (first derivative)
is a polynomial of degree $n - 1$:

    $f'(x) = n a_0 x^{n-1} + (n-1) a_1 x^{n-2} + \ldots + 2 a_{n-2} x + a_{n-1}$

The derivative of a polynomial of degree zero and the derivative
of zero are taken to be equal to zero. The derivative of the first
derivative is called the second derivative of the polynomial $f(x)$
and is denoted by $f''(x)$, etc. It is obvious that

    $f^{(n)}(x) = n!\, a_0$

and therefore $f^{(n+1)}(x) = 0$; i.e., the $(n+1)$th derivative of a
polynomial of degree $n$ is equal to zero.

In our case of polynomials with complex coefficients, we cannot
make use of the properties of a derivative as proved in the course
of analysis for polynomials with real coefficients; we have to prove
these properties once again using the definition of a derivative given
above. We are interested in the following properties, which are
called formulas for differentiating a sum and a product:

    $(f(x) + g(x))' = f'(x) + g'(x)$                      (4)

    $(f(x)\, g(x))' = f(x)\, g'(x) + f'(x)\, g(x)$            (5)

These formulas can easily be verified, incidentally, by direct
computation, by taking for $f(x)$ and $g(x)$ two arbitrary polynomials
and applying the above definition of a derivative; we leave this
to the reader.
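
A direct check of this kind can be sketched in Python. The helper names `derivative`, `add` and `mul` are illustrative, polynomials are stored as coefficient lists with the highest degree first, and the two sample polynomials are arbitrary choices with Gaussian-integer coefficients so that the comparison is exact. The check confirms (4) and (5) for one pair only; it is not a proof.

```python
def derivative(p):
    """Formal derivative of a_0 x^n + ... + a_n; p = [a_0, ..., a_n]."""
    n = len(p) - 1
    if n <= 0:
        return [0]                      # derivative of a constant (or of zero)
    return [(n - i) * p[i] for i in range(n)]


def add(p, q):
    """Sum of two polynomials, padding the shorter list with leading zeros."""
    m = max(len(p), len(q))
    p = [0] * (m - len(p)) + p
    q = [0] * (m - len(q)) + q
    return [a + b for a, b in zip(p, q)]


def mul(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


f = [1 + 2j, 0, -3j, 4]     # (1 + 2i) x^3 - 3i x + 4
g = [2, 1j, -1]             # 2 x^2 + i x - 1

sum_rule = derivative(add(f, g)) == add(derivative(f), derivative(g))
product_rule = derivative(mul(f, g)) == add(mul(f, derivative(g)),
                                            mul(derivative(f), g))
print(sum_rule, product_rule)
```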

Formula (5) can readily be extended to the case of a product
of any finite number of factors and therefore we can in the ordinary
fashion derive a formula for the derivative of a power:

    $(f^k(x))' = k f^{k-1}(x)\, f'(x)$                      (6)

Our aim will be to prove the following theorem.

If the number $c$ is a $k$-fold root of the polynomial $f(x)$, then for
$k > 1$ it will be the $(k-1)$-fold root of the first derivative of this
polynomial; but if $k = 1$, then $c$ will not be a root of $f'(x)$.

Let

    $f(x) = (x - c)^k \varphi(x), \quad k \geq 1$                  (7)

where $\varphi(x)$ is no longer divisible by $x - c$. Differentiating
equation (7), we get

    $f'(x) = (x - c)^k \varphi'(x) + k (x - c)^{k-1} \varphi(x)$
    $\quad = (x - c)^{k-1} [(x - c) \varphi'(x) + k \varphi(x)]$

The first term of the sum in the square brackets is divisible by
$x - c$, the second is not divisible by $x - c$; therefore, the whole
sum is not divisible by $x - c$. Taking into account that the quotient
of $f'(x)$ by $(x - c)^{k-1}$ is uniquely defined, we find that $(x - c)^{k-1}$
is the highest power of the binomial $x - c$ which divides the
polynomial $f'(x)$. The proof is complete.

Applying this theorem several times, we find that the $k$-fold
root of the polynomial $f(x)$ is the $(k - s)$-fold root of the $s$th derivative
of this polynomial ($k \geq s$) and for the first time will not be a root of the
$k$th derivative of $f(x)$.
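
The last remark gives a simple way to compute the multiplicity of a root: count how many of $f, f', f'', \ldots$ vanish at $c$ before one does not. The Python sketch below does this; the names `evaluate`, `derivative` and `multiplicity` are illustrative, and the exact test `== 0` is only reliable when the arithmetic is exact (integers or `fractions.Fraction`), which is assumed here.

```python
def evaluate(p, c):
    """Value at c of the polynomial p (coefficients, highest degree first)."""
    v = 0
    for a in p:
        v = v * c + a
    return v


def derivative(p):
    """Formal derivative of p."""
    n = len(p) - 1
    return [(n - i) * p[i] for i in range(n)] if n > 0 else [0]


def multiplicity(p, c):
    """Number of successive derivatives f, f', f'', ... that vanish at c."""
    k = 0
    while any(p) and evaluate(p, c) == 0:   # any(p) stops on the zero polynomial
        p = derivative(p)
        k += 1
    return k


# f(x) = (x - 2)^3 (x + 1) = x^4 - 5x^3 + 6x^2 + 4x - 8
f = [1, -5, 6, 4, -8]
print(multiplicity(f, 2), multiplicity(f, -1), multiplicity(f, 0))   # 3 1 0
```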

23. Fundamental Theorem 

In examining the roots of polynomials in the preceding section
we did not pose the question of whether every polynomial possesses
roots. We know that there are polynomials with real coefficients
that do not have real roots; $x^2 + 1$ is such a polynomial. It might
be expected that there are polynomials which do not have roots
even in the class of complex numbers, particularly if we consider
polynomials with arbitrary complex coefficients. If this were the
case, then the system of complex numbers would require a further
extension. Actually, however, the following fundamental theorem
of the algebra of complex numbers is valid.

Every polynomial of degree at least one with arbitrary numerical 
coefficients has at least one root, which in the general case is complex. 

This theorem is one of the greatest attainments of the whole 
of mathematics and finds application in the most diverse spheres 
of science. In particular, it is the starting point of everything in the 
theory of polynomials with numerical coefficients and for this 
reason it was once called (and sometimes still is) the “fundamental 
theorem of higher algebra”. Actually, however, the fundamental 
theorem is not purely algebraic. All its proofs— and since Gauss 
first proved the theorem at the end of the eighteenth century a very 
large number have been found— are forced, in one degree or another, 
to make use of the so-called topological properties of the real and 
complex numbers, that is properties associated with continuity. 

In the proof which we now give, the polynomial $f(x)$ with complex
coefficients will be regarded as a complex function of a complex
variable $x$. Thus, $x$ can assume any complex values, or, taking
into account the mode of constructing complex numbers given
in Sec. 17, the variable $x$ ranges over the complex plane. The values
of the function $f(x)$ will also be complex numbers. We may consider
that these values are plotted on a second complex plane, as in the
case of real functions of a real variable where the values of the
independent variable are plotted on one number line (axis of abscissas)
while the values of the function are plotted on the other line
(axis of ordinates).

The definition of a continuous function as given in the course 
of mathematical analysis is carried over to functions of a complex 
variable (in the formulation of the definition, absolute values are 
replaced by moduli). 

Namely, the complex function $f(x)$ of a complex variable $x$ is
continuous at a point $x_0$ if for any positive real number $\varepsilon$ there is
a positive real number $\delta$ such that no matter what (generally speaking,
complex) the increment $h$, the modulus of which satisfies the
inequality $|h| < \delta$, the inequality

    $|f(x_0 + h) - f(x_0)| < \varepsilon$

holds true. A function $f(x)$ is called continuous if it is continuous
at all points $x_0$ at which it is defined, that is, if $f(x)$ is a polynomial,
on the entire complex plane.

The polynomial $f(x)$ is a continuous function of the complex variable $x$.

The proof of this theorem could be given as it is in the course
of mathematical analysis, namely, by showing that the sum and
the product of continuous functions are themselves continuous and
then noting that a function which is constantly equal to one and
the same complex number is continuous. However, we shall take
a different approach.

We first prove the particular case of the theorem when the constant
term of the polynomial $f(x)$ is zero; and we will only prove
the continuity of $f(x)$ at the point $x_0 = 0$. In other words, we will
prove the following lemma (in place of $h$ we write $x$).

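Lemma 1. If the constant term of the polynomial $f(x)$ is zero,

    $f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_{n-1} x$

that is, $f(0) = 0$, then for any $\varepsilon > 0$ there is a $\delta > 0$ such that for
all $x$ for which $|x| < \delta$ it is true that $|f(x)| < \varepsilon$.

Indeed, let

    $A = \max(|a_0|, |a_1|, \ldots, |a_{n-1}|)$

We are already given the number $\varepsilon$. Let us show that if for the
number $\delta$ we take

    $\delta = \frac{\varepsilon}{A + \varepsilon}$                              (1)

then it will satisfy the required conditions.
Indeed,

    $|f(x)| \leq |a_0| |x|^n + |a_1| |x|^{n-1} + \ldots + |a_{n-1}| |x|$
    $\quad \leq A (|x|^n + |x|^{n-1} + \ldots + |x|)$

that is,

    $|f(x)| \leq A \frac{|x| - |x|^{n+1}}{1 - |x|}$

Since $|x| < \delta$ and, by (1), $\delta < 1$, it follows that

    $\frac{|x| - |x|^{n+1}}{1 - |x|} < \frac{|x|}{1 - |x|}$

and therefore

    $|f(x)| < \frac{A |x|}{1 - |x|} < \frac{A \delta}{1 - \delta} = \frac{A \cdot \frac{\varepsilon}{A + \varepsilon}}{1 - \frac{\varepsilon}{A + \varepsilon}} = \varepsilon$

which completes the proof.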
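
As a numerical illustration of Lemma 1, the sketch below takes $\delta = \varepsilon / (A + \varepsilon)$, the choice that agrees with the final estimate $A\delta/(1 - \delta) = \varepsilon$ in the proof, and tests the conclusion at a few points with $|x|$ slightly below $\delta$. The function names and the sample polynomial are assumptions made for this example, and a finite set of sample points illustrates the statement without proving it.

```python
import cmath


def lemma1_delta(coeffs, eps):
    """delta = eps / (A + eps), where A is the largest modulus of a coefficient.

    coeffs = [a_0, ..., a_{n-1}] for f(x) = a_0 x^n + ... + a_{n-1} x.
    """
    A = max(abs(a) for a in coeffs)
    return eps / (A + eps)


def evaluate_no_constant(coeffs, x):
    """Value of a_0 x^n + ... + a_{n-1} x (a polynomial with zero constant term)."""
    v = 0
    for a in coeffs:
        v = (v + a) * x
    return v


coeffs = [3 - 4j, 2, -1j]            # f(x) = (3 - 4i) x^3 + 2 x^2 - i x, so A = 5
eps = 0.01
delta = lemma1_delta(coeffs, eps)
radius = 0.99 * delta                # any modulus below delta is admissible
samples = [cmath.rect(radius, 2 * cmath.pi * t / 12) for t in range(12)]
ok = all(abs(evaluate_no_constant(coeffs, x)) < eps for x in samples)
print(delta, ok)
```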

Let us now derive the following formula. Suppose we have the
polynomial

    $f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_{n-1} x + a_n$

with arbitrary complex coefficients. Substitute in place of $x$ the
sum $x + h$, where $h$ is a second unknown. Using the binomial
theorem, expand each of the powers $(x + h)^k$, $k \leq n$, in the right-hand
member and collect terms with like powers of $h$. This yields
(as the reader can readily verify) the equation

    $f(x + h) = f(x) + h f'(x) + \frac{h^2}{2!} f''(x) + \ldots + \frac{h^n}{n!} f^{(n)}(x)$

In other words, we obtain Taylor's formula, which gives the expansion
of $f(x + h)$ in powers of the "increment" $h$.
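
Taylor's formula can be checked exactly on an example. The sketch below, with illustrative helper names and rational values of $x$ and $h$ chosen arbitrarily, sums $\frac{h^k}{k!} f^{(k)}(x)$ using the formal derivative and compares the result with $f(x + h)$; exact `Fraction` arithmetic is used so that the comparison is an identity rather than an approximation.

```python
from fractions import Fraction
from math import factorial


def evaluate(p, x):
    """Value at x of the polynomial p (coefficients, highest degree first)."""
    v = 0
    for a in p:
        v = v * x + a
    return v


def derivative(p):
    """Formal derivative of p."""
    n = len(p) - 1
    return [(n - i) * p[i] for i in range(n)] if n > 0 else [0]


def taylor_sum(p, x, h):
    """f(x) + h f'(x) + h^2/2! f''(x) + ... + h^n/n! f^(n)(x)."""
    total, k = 0, 0
    while any(p):                        # stop once the derivative is zero
        total += Fraction(h) ** k / factorial(k) * evaluate(p, x)
        p = derivative(p)
        k += 1
    return total


f = [1, -2, 0, 5]                        # x^3 - 2x^2 + 5
x, h = Fraction(1, 2), Fraction(-3, 4)
print(taylor_sum(f, x, h) == evaluate(f, x + h))   # True
```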

The continuity of an arbitrary polynomial $f(x)$ at any point $x_0$
is now proved as follows. By Taylor's formula,

    $f(x_0 + h) - f(x_0) = c_1 h + c_2 h^2 + \ldots + c_n h^n = \varphi(h)$

where

    $c_1 = f'(x_0), \quad c_2 = \frac{1}{2!} f''(x_0), \quad \ldots, \quad c_n = \frac{1}{n!} f^{(n)}(x_0)$

The polynomial $\varphi(h)$ in the unknown $h$ is a polynomial without
a constant term, and so, by Lemma 1, for any $\varepsilon > 0$ there is a $\delta > 0$
such that for $|h| < \delta$ it is true that $|\varphi(h)| < \varepsilon$, i.e.,

    $|f(x_0 + h) - f(x_0)| < \varepsilon$

which completes the proof.

From the inequality

    $\big| |f(x_0 + h)| - |f(x_0)| \big| \leq |f(x_0 + h) - f(x_0)|$

based on formula (13), Sec. 18, and from the continuity, just proved,
of a polynomial there follows the continuity of the modulus $|f(x)|$
of the polynomial $f(x)$; this modulus is obviously a real nonnegative
function of the complex variable $x$.

We shall now prove the lemmas that are used in the proof of the
fundamental theorem.

Lemma on the modulus of the highest-degree term. If we have
an nth-degree polynomial, $n \geq 1$,

    $f(x) = a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \ldots + a_n$

with arbitrary complex coefficients and if $k$ is any positive real number,
then for sufficiently large (in modulus) values of the unknown $x$ the
inequality

    $|a_0 x^n| > k\, |a_1 x^{n-1} + a_2 x^{n-2} + \ldots + a_n|$          (2)

is true, that is, the modulus of the highest-degree term is greater than
the modulus of the sum of all the remaining terms; it is an arbitrary
number of times greater.

Indeed, let $A$ be the largest of the moduli of the coefficients
$a_1, a_2, \ldots, a_n$:

    $A = \max(|a_1|, |a_2|, \ldots, |a_n|)$

Then (see, in Sec. 18, the properties of the moduli of a sum and
a product of complex numbers)

    $|a_1 x^{n-1} + a_2 x^{n-2} + \ldots + a_n| \leq |a_1| |x|^{n-1} + |a_2| |x|^{n-2} + \ldots + |a_n|$
    $\quad \leq A (|x|^{n-1} + |x|^{n-2} + \ldots + 1) = A \frac{|x|^n - 1}{|x| - 1}$

Assuming $|x| > 1$, we get

    $\frac{|x|^n - 1}{|x| - 1} < \frac{|x|^n}{|x| - 1}$

whence

    $|a_1 x^{n-1} + a_2 x^{n-2} + \ldots + a_n| < A \frac{|x|^n}{|x| - 1}$

Thus, inequality (2) will be fulfilled if $x$ satisfies the condition
$|x| > 1$ and also the inequality

    $k A \frac{|x|^n}{|x| - 1} \leq |a_0 x^n| = |a_0| |x|^n$

that is, if

    $|x| \geq \frac{k A}{|a_0|} + 1$                              (3)

Since the right side of inequality (3) is greater than 1, it may be
asserted that, for values of $x$ satisfying this inequality, inequality
(2) holds true. This proves the lemma.
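
The bound (3), $|x| \geq kA/|a_0| + 1$, is easy to compute. In the following Python sketch the names `dominance_radius` and `tail_value` and the sample polynomial are illustrative assumptions; inequality (2) is tested only at sixteen points on the circle whose radius is the bound, which illustrates the lemma without proving it.

```python
import cmath


def dominance_radius(coeffs, k):
    """Right side of (3): k*A/|a_0| + 1 with A = max(|a_1|, ..., |a_n|).

    coeffs = [a_0, a_1, ..., a_n] with n >= 1 and a_0 != 0.
    """
    A = max(abs(a) for a in coeffs[1:])
    return k * A / abs(coeffs[0]) + 1


def tail_value(coeffs, x):
    """Value of a_1 x^(n-1) + a_2 x^(n-2) + ... + a_n."""
    v = 0
    for a in coeffs[1:]:
        v = v * x + a
    return v


coeffs = [2, -(3 + 4j), 1, -7]       # 2x^3 - (3 + 4i)x^2 + x - 7, so A = 7
k = 10
R = dominance_radius(coeffs, k)      # 10 * 7 / 2 + 1 = 36.0
n = len(coeffs) - 1
points = [cmath.rect(R, 2 * cmath.pi * t / 16) for t in range(16)]
holds = all(abs(coeffs[0] * x ** n) > k * abs(tail_value(coeffs, x))
            for x in points)
print(R, holds)
```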

Lemma on the increase of the modulus of a polynomial. For
every polynomial $f(x)$ of degree not less than unity with complex
coefficients, and for any arbitrarily large positive real number $M$, it is
possible to find a positive real number $N$ such that for $|x| > N$ it will be
true that $|f(x)| > M$.

Let

    $f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_n$

By formula (11), Sec. 18,

    $|f(x)| = |a_0 x^n + (a_1 x^{n-1} + \ldots + a_n)|$
    $\quad \geq |a_0 x^n| - |a_1 x^{n-1} + \ldots + a_n|$                 (4)

Apply the lemma on the modulus of the highest-degree term, putting
$k = 2$; there is a number $N_1$ such that for $|x| > N_1$ it is true that

    $|a_0 x^n| > 2\, |a_1 x^{n-1} + \ldots + a_n|$

whence

    $|a_1 x^{n-1} + \ldots + a_n| < \frac{1}{2} |a_0 x^n|$

that is, by (4),

    $|f(x)| > |a_0 x^n| - \frac{1}{2} |a_0 x^n| = \frac{1}{2} |a_0 x^n|$

The right side of this inequality is greater than $M$ for

    $|x| > N_2 = \sqrt[n]{\frac{2M}{|a_0|}}$

Thus, for $|x| > N = \max(N_1, N_2)$ we have $|f(x)| > M$.

[Fig. 8]
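
The number $N$ of this lemma can be computed explicitly. The Python sketch below takes $N_1 = 2A/|a_0| + 1$, the bound of the preceding lemma with $k = 2$, and $N_2 = \sqrt[n]{2M/|a_0|}$; the function name `growth_radius` and the test polynomial $x^2 + 1$ are choices made for this illustration, and the final check samples only a few points just outside the circle $|x| = N$.

```python
import cmath


def evaluate(p, x):
    """Value at x of the polynomial p (coefficients, highest degree first)."""
    v = 0
    for a in p:
        v = v * x + a
    return v


def growth_radius(coeffs, M):
    """N = max(N_1, N_2) such that |f(x)| > M whenever |x| > N (degree >= 1)."""
    a0 = abs(coeffs[0])
    n = len(coeffs) - 1
    A = max(abs(a) for a in coeffs[1:])
    N1 = 2 * A / a0 + 1              # highest-degree term dominates with k = 2
    N2 = (2 * M / a0) ** (1 / n)     # makes |a_0 x^n| / 2 exceed M
    return max(N1, N2)


coeffs, M = [1, 0, 1], 1000          # f(x) = x^2 + 1
N = growth_radius(coeffs, M)
points = [cmath.rect(1.001 * N, 2 * cmath.pi * t / 16) for t in range(16)]
exceeds = all(abs(evaluate(coeffs, x)) > M for x in points)
print(N, exceeds)
```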

The meaning of this lemma may be illustrated geometrically
(we will frequently make use of this illustration). Suppose that at
every point $x_0$ of the complex plane a perpendicular is erected whose
length (for the given scale unit) is equal to the modulus of the value
of the polynomial $f(x)$ at this point, that is, is equal to $|f(x_0)|$.
The endpoints of the perpendiculars will, in view of the above-proved
continuity of the modulus of a polynomial, constitute some continuous
curved surface situated above the complex plane.

The lemma on the increase of the modulus of a polynomial shows that
as $|x_0|$ increases this surface recedes from the complex plane, though
quite naturally the recession is not in the least monotonic. Fig. 8
is a schematic view of the line of intersection of this surface with
a plane perpendicular to the complex plane and passing through
the point 0.

The following lemma plays a crucial role in the proof.

D'Alembert's lemma. If for $x = x_0$ the polynomial $f(x)$ of degree
$n$, $n \geq 1$, does not vanish, $f(x_0) \neq 0$ and therefore $|f(x_0)| > 0$, then
it is possible to find an increment $h$ (complex in the general case) such
that

    $|f(x_0 + h)| < |f(x_0)|$

If the increment $h$ is as yet arbitrary, then Taylor's formula
yields

    $f(x_0 + h) = f(x_0) + h f'(x_0) + \frac{h^2}{2!} f''(x_0) + \ldots + \frac{h^n}{n!} f^{(n)}(x_0)$

By hypothesis, $x_0$ is not a root of $f(x)$. It may, however, fortuitously
be a root of $f'(x)$ and perhaps also of certain other higher derivatives.
Let the $k$th derivative ($k \geq 1$) be the first that does not have
$x_0$ for a root, that is,

    $f'(x_0) = f''(x_0) = \ldots = f^{(k-1)}(x_0) = 0, \quad f^{(k)}(x_0) \neq 0$

Such a $k$ exists since if $a_0$ is the leading coefficient of the polynomial
$f(x)$, then

    $f^{(n)}(x_0) = n!\, a_0 \neq 0$

Thus,

    $f(x_0 + h) = f(x_0) + \frac{h^k}{k!} f^{(k)}(x_0) + \frac{h^{k+1}}{(k+1)!} f^{(k+1)}(x_0) + \ldots + \frac{h^n}{n!} f^{(n)}(x_0)$

Some of the numbers $f^{(k+1)}(x_0), \ldots, f^{(n-1)}(x_0)$ may also be zero,
but this does not affect our reasoning in any way.

Dividing both sides of the equation by $f(x_0)$, which, by hypothesis,
is different from zero, and introducing the notation

    $c_j = \frac{f^{(j)}(x_0)}{j!\, f(x_0)}, \quad j = k, k+1, \ldots, n$

we get

    $\frac{f(x_0 + h)}{f(x_0)} = 1 + c_k h^k + c_{k+1} h^{k+1} + \ldots + c_n h^n$

or, because $c_k \neq 0$,

    $\frac{f(x_0 + h)}{f(x_0)} = (1 + c_k h^k) + c_k h^k \left( \frac{c_{k+1}}{c_k} h + \ldots + \frac{c_n}{c_k} h^{n-k} \right)$

Taking moduli, we get

    $\left| \frac{f(x_0 + h)}{f(x_0)} \right| \leq |1 + c_k h^k| + |c_k h^k| \left| \frac{c_{k+1}}{c_k} h + \ldots + \frac{c_n}{c_k} h^{n-k} \right|$          (5)

Up to this point we have not made any assumptions concerning
the increment $h$. Now we will choose $h$: we choose the modulus and
the argument separately. We choose the modulus of $h$ in the following
manner. Since

    $\frac{c_{k+1}}{c_k} h + \ldots + \frac{c_n}{c_k} h^{n-k}$

is a polynomial in $h$ without the constant term, it follows by Lemma 1
(setting $\varepsilon = \frac{1}{2}$) that there is a $\delta_1$ such that for $|h| < \delta_1$ it will
be true that

    $\left| \frac{c_{k+1}}{c_k} h + \ldots + \frac{c_n}{c_k} h^{n-k} \right| < \frac{1}{2}$          (6)

On the other hand, for

    $|h| < \delta_2 = \sqrt[k]{|c_k|^{-1}}$

we have

    $|c_k h^k| < 1$                                        (7)

Assume that the modulus of $h$ is chosen in accord with the inequality

    $|h| < \min(\delta_1, \delta_2)$                              (8)

Then, because of (6), inequality (5) becomes the strict inequality

    $\left| \frac{f(x_0 + h)}{f(x_0)} \right| < |1 + c_k h^k| + \frac{1}{2} |c_k h^k|$          (9)

We will use Condition (7) later on.

To choose the argument of $h$ we require that the number $c_k h^k$
be a negative real number. In other words,

    $\arg(c_k h^k) = \arg c_k + k \arg h = \pi$

whence

    $\arg h = \frac{\pi - \arg c_k}{k}$                          (10)

In this choice of $h$ the number $c_k h^k$ will differ from its absolute value
in sign:

    $c_k h^k = -|c_k h^k|$

and therefore, using inequality (7),

    $|1 + c_k h^k| = \big| 1 - |c_k h^k| \big| = 1 - |c_k h^k|$

Thus, for $h$ chosen on the basis of the Conditions (8) and (10),
inequality (9) takes the form

    $\left| \frac{f(x_0 + h)}{f(x_0)} \right| < 1 - |c_k h^k| + \frac{1}{2} |c_k h^k| = 1 - \frac{1}{2} |c_k h^k|$

and all the more so

    $\left| \frac{f(x_0 + h)}{f(x_0)} \right| = \frac{|f(x_0 + h)|}{|f(x_0)|} < 1$

whence it follows that

    $|f(x_0 + h)| < |f(x_0)|$

This completes the proof of d'Alembert's lemma.
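
The proof is constructive, and the choice of $h$ can be followed step by step in code. The Python sketch below is one possible reading of conditions (8) and (10), not a definitive implementation: the function name `dalembert_increment` is illustrative, $\delta_1$ is taken from Lemma 1 as $\varepsilon/(A + \varepsilon)$ with $\varepsilon = 1/2$, the modulus of $h$ is set (arbitrarily) to half of $\min(\delta_1, \delta_2)$, and the test `!= 0` for the first nonvanishing derivative assumes the derivative values are computed exactly, as they are for the integer example used here.

```python
import cmath
from math import factorial, inf


def evaluate(p, x):
    """Value at x of the polynomial p (coefficients, highest degree first)."""
    v = 0
    for a in p:
        v = v * x + a
    return v


def derivative(p):
    """Formal derivative of p."""
    n = len(p) - 1
    return [(n - i) * p[i] for i in range(n)] if n > 0 else [0]


def dalembert_increment(p, x0):
    """Return h with |f(x0 + h)| < |f(x0)| for a polynomial of degree >= 1."""
    f0 = evaluate(p, x0)
    if f0 == 0:
        raise ValueError("x0 is already a root")
    n = len(p) - 1
    c, d = [], p
    for j in range(1, n + 1):                    # c[j-1] = f^(j)(x0) / (j! f(x0))
        d = derivative(d)
        c.append(evaluate(d, x0) / (factorial(j) * f0))
    k = next(j for j in range(1, n + 1) if c[j - 1] != 0)
    ck = c[k - 1]
    ratios = [cj / ck for cj in c[k:]]           # c_{k+1}/c_k, ..., c_n/c_k
    A = max((abs(r) for r in ratios), default=0)
    delta1 = 0.5 / (A + 0.5) if A > 0 else inf   # Lemma 1 with eps = 1/2
    delta2 = abs(ck) ** (-1 / k)                 # ensures |c_k h^k| < 1
    modulus = 0.5 * min(delta1, delta2)          # condition (8)
    argument = (cmath.pi - cmath.phase(ck)) / k  # condition (10)
    return cmath.rect(modulus, argument)


f = [1, 0, 1]                                    # f(x) = x^2 + 1
for x0 in (0, 1):
    h = dalembert_increment(f, x0)
    print(x0, h, abs(evaluate(f, x0 + h)) < abs(evaluate(f, x0)))
```

At $x_0 = 0$ the first derivative of $x^2 + 1$ vanishes, so the example also exercises the case $k = 2$.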

Using the geometric interpretation given earlier, we can describe
d'Alembert's lemma in the following fashion. Given that $|f(x_0)| > 0$.
This means that the length of the perpendicular erected to the
complex plane at point $x_0$ is nonzero. Then, by d'Alembert's lemma,
there is a point $x_1 = x_0 + h$ such that $|f(x_1)| < |f(x_0)|$; that is,
the perpendicular at the point $x_1$ will be shorter than at the point
$x_0$ and, consequently, the surface formed by the endpoints of the
perpendiculars will at this new point be somewhat closer to the
complex plane. As the proof of the lemma shows, the modulus
of $h$ may be taken as small as we wish; in other words, the point $x_1$
may be chosen arbitrarily close to the point $x_0$. However, we will
not take advantage of this remark in the future.

Obviously, the roots of the polynomial $f(x)$ will be those complex
numbers (or those points of the complex plane) at which the
surface formed by the endpoints of the perpendiculars touches this
plane. It is impossible to prove the existence of such points by
relying on d'Alembert's lemma alone. Indeed, using this lemma it is
possible to find an infinite sequence of points $x_0, x_1, x_2, \ldots$
such that

    $|f(x_0)| > |f(x_1)| > |f(x_2)| > \ldots$                    (11)

However, it does not follow from this that there exists a point $x$
such that $f(x) = 0$, all the more so since the decreasing sequence
of positive real numbers (11) does not necessarily have to tend
to zero.

The considerations that follow are based on a theorem from the
theory of functions of a complex variable that generalizes the
Weierstrass theorem, which is familiar to the reader from the course
of mathematical analysis. It has to do with real functions of a complex
variable, that is, with functions of a complex variable that
take on only real values. The modulus of a polynomial is an instance
of such functions. For the sake of simplicity, in the statement of
this theorem we will speak about a closed circle $E$, to be understood
as a circle in the complex plane with all boundary points included.

If a real function $g(x)$ of a complex variable $x$ is continuous at all
points of a closed circle $E$, then there exists in $E$ a point $x_0$ such that
for all $x$ in $E$ the inequality $g(x) \geq g(x_0)$ holds. Consequently, the
point $x_0$ is the minimum point of $g(x)$ in the circle $E$.

The proof of this theorem is given in all courses of complex
function theory and so we omit it.


We confine ourselves to the case when the function $g(x)$ is nonnegative
at all points of $E$ (only this case is of interest to us) and
will try to explain this theorem geometrically with the aid of the
illustration used earlier. Draw a perpendicular of length $g(x_0)$ at
every point $x_0$ of the circle $E$. The endpoints of these perpendiculars
constitute a piece of a continuous curved surface, and due to the
closed nature of the circle $E$ the existence of minimum points of this
piece of surface is geometrically clear. This illustration does not
of course take the place of a proof of the theorem.

We can now take up the proof of the fundamental theorem itself.
Let there be given a polynomial $f(x)$ of degree $n$, $n \geq 1$. If its
constant term is $a_n$, then obviously $f(0) = a_n$. Let us apply to our
polynomial the lemma on the increase of the modulus of a polynomial,
assuming $M = |f(0)| = |a_n|$. Consequently, there exists
an $N$ such that for $|x| > N$ it will be true that $|f(x)| > |f(0)|$.
It is then obvious that the above-indicated generalization of the
Weierstrass theorem is applicable to the function $|f(x)|$ for any
choice of the closed circle $E$. For $E$ we take a closed circle of radius
$N$ with centre at 0. Let the point $x_0$ be the minimum point of $|f(x)|$
in $E$, whence, in particular, it follows that $|f(x_0)| \leq |f(0)|$.

It is easy to see that $x_0$ will actually serve as the minimum point of
$|f(x)|$ over the entire complex plane: if the point $x'$ lies outside $E$,
then $|x'| > N$ and for this reason

    $|f(x')| > |f(0)| \geq |f(x_0)|$

Whence it follows, finally, that $f(x_0) = 0$, or that $x_0$ is a root
of $f(x)$. If we had $f(x_0) \neq 0$, then, by d'Alembert's lemma, there
would be a point $x_1$ such that $|f(x_1)| < |f(x_0)|$. However, this
contradicts the property of point $x_0$ that we have just established.

Another proof of the fundamental theorem will be given in
Sec. 55.

24. Corollaries to the Fundamental Theorem 

Suppose we have a polynomial of degree $n$, $n \geq 1$,

    $f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_{n-1} x + a_n$          (1)

with arbitrary complex coefficients. We again regard it as a formal
algebraic expression which is fully defined by the set of its
coefficients. The fundamental theorem on the existence of a root that
was proved in the preceding section permits asserting the existence
of a complex or real root $\alpha_1$ of $f(x)$. Therefore, the polynomial $f(x)$
has the factorization

    $f(x) = (x - \alpha_1) \varphi(x)$

The coefficients of the polynomial $\varphi(x)$ are again real or complex
numbers, and therefore $\varphi(x)$ has a root $\alpha_2$, whence

    $f(x) = (x - \alpha_1)(x - \alpha_2) \psi(x)$

Continuing in similar fashion, we arrive, after a finite number
of steps, at a factorization of the nth-degree polynomial $f(x)$ into
a product of $n$ linear factors,

    $f(x) = a_0 (x - \alpha_1)(x - \alpha_2) \ldots (x - \alpha_n)$          (2)

The coefficient $a_0$ is a result of the following: if we had a coefficient
$b$ on the right of (2), then after removal of parentheses the
highest-degree term of the polynomial $f(x)$ would be of the form
$b x^n$, though in reality, by (1), it is the term $a_0 x^n$. Therefore, $b = a_0$.

For the polynomial $f(x)$, expansion (2) is, to within the order of
the factors, a unique expansion of that type.

Let there be yet another expansion

    $f(x) = a_0 (x - \beta_1)(x - \beta_2) \ldots (x - \beta_n)$          (3)

From (2) and (3) follows the equation

    $(x - \alpha_1)(x - \alpha_2) \ldots (x - \alpha_n) = (x - \beta_1)(x - \beta_2) \ldots (x - \beta_n)$          (4)

If the root $\alpha_i$ were different from all $\beta_j$, $j = 1, 2, \ldots, n$, then,
substituting $\alpha_i$ in place of the unknown into (4), we would have
zero on the left and a nonzero number on the right. Thus, every
root $\alpha_i$ is equal to some root $\beta_j$, and conversely.

From this it does not yet follow that the expansions (2) and
(3) are coincident. Indeed, there may be equal roots among the
roots $\alpha_i$, $i = 1, 2, \ldots, n$. For example, let $s$ of those roots be
equal to $\alpha_1$ and, on the other hand, let there be $t$ roots equal to the
root $\alpha_1$ among the roots $\beta_j$, $j = 1, 2, \ldots, n$. We have to show that
$s = t$.

Since the degree of a product of polynomials is equal to the
sum of the degrees of the factors, the product of two polynomials
different from zero cannot be zero. It then follows that if two products
of polynomials are equal, then a common factor can be cancelled
from both sides of the equation: if

    $f(x) \varphi(x) = g(x) \varphi(x)$

and $\varphi(x) \neq 0$, then from

    $[f(x) - g(x)] \varphi(x) = 0$

it follows that

    $f(x) - g(x) = 0$

that is,

    $f(x) = g(x)$

Let us apply this to equation (4). If, for instance, $s > t$, then
by cancelling the factor $(x - \alpha_1)^t$ out of both sides of (4), we arrive
at an equation whose left side contains the factor $x - \alpha_1$ and whose
right side does not contain it. But it has been shown that this is
a contradiction, which proves the uniqueness of the expansion (2)
of the polynomial $f(x)$.

Collecting like factors, we can write (2) as

    $f(x) = a_0 (x - \alpha_1)^{k_1} (x - \alpha_2)^{k_2} \ldots (x - \alpha_l)^{k_l}$          (5)

where

    $k_1 + k_2 + \ldots + k_l = n$

It is now assumed that there are no equal roots among the roots
$\alpha_1, \alpha_2, \ldots, \alpha_l$.

We will prove that the number $k_i$ of (5), $i = 1, 2, \ldots, l$, is the
multiplicity of the root $\alpha_i$ in the polynomial $f(x)$. Indeed, if this
multiplicity is equal to $s_i$, then $k_i \leq s_i$. However, let $k_i < s_i$. By virtue
of the definition of multiplicity of a root of $f(x)$, we have the expansion

    $f(x) = (x - \alpha_i)^{s_i} \varphi(x)$

Replacing in this expansion the factor $\varphi(x)$ by its factorization
into linear factors, we would get for $f(x)$ a factorization into linear
factors that is definitely different from (2); in other words, it would
contradict the above-proved uniqueness of the expansion.

We have thus proved the following important result.

Any polynomial $f(x)$ of degree $n$, $n \geq 1$, with arbitrary numerical
coefficients has $n$ roots if each of the roots is counted to the degree of its
multiplicity.

Note that this theorem holds true for $n = 0$ as well, since a polynomial
of zero degree quite naturally has no roots. This theorem
is not applicable only to the polynomial 0, which has no degree
and is equal to zero for any value of $x$. We use this last remark in
the proof of the following theorem.

If the polynomials $f(x)$ and $g(x)$ whose degrees do not exceed $n$ have
equal values for more than $n$ distinct values of the unknown, then
$f(x) = g(x)$.

Indeed, the polynomial $f(x) - g(x)$ has, by hypothesis, more
roots than $n$, and since its degree does not exceed $n$, the equation
$f(x) - g(x) = 0$ must be true.

Thus, taking into account that there is an infinity of different
numbers, we can assert that for any two distinct polynomials $f(x)$
and $g(x)$ there will be values $c$ of the unknown $x$ such that
$f(c) \neq g(c)$. Such $c$ may be found not only among the complex numbers
but also among the real numbers, rational numbers and even the
integers.

Consequently, two polynomials with numerical coefficients
having different coefficients of at least one power of the unknown $x$
will be distinct complex functions of the complex variable $x$. Finally,
this proves the equivalence, for polynomials with numerical coefficients,
of the two definitions of equality of polynomials given in Sec. 20: the
algebraic definition and the function-theoretic definition.

The theorem proved above permits us to assert that a polynomial
whose degree does not exceed $n$ is completely determined by its values
for any distinct values of the unknown whose number is greater than $n$.
Can these values of the polynomial be specified arbitrarily? If we
assume that the values of a polynomial are given for $n + 1$ distinct
values of the unknown, then the answer is yes: there always exists
a polynomial of degree not higher than $n$ which takes on preassigned
values for $n + 1$ specified distinct values of the unknown.

Indeed, let it be necessary to construct a polynomial of degree
not higher than $n$, which, for values of the unknown $a_1, a_2, \ldots, a_{n+1}$
(assumed distinct), takes on, respectively, the values
$c_1, c_2, \ldots, c_{n+1}$. The polynomial will be

    $f(x) = \sum_{i=1}^{n+1} c_i \frac{(x - a_1) \ldots (x - a_{i-1})(x - a_{i+1}) \ldots (x - a_{n+1})}{(a_i - a_1) \ldots (a_i - a_{i-1})(a_i - a_{i+1}) \ldots (a_i - a_{n+1})}$          (6)

Indeed, its degree does not exceed $n$ and the value of $f(a_i)$ is equal
to $c_i$.
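
Formula (6) can be evaluated directly. The Python sketch below computes the value at a point $x$ of the polynomial determined by given nodes and values; the function name `lagrange_value` and the sample data are illustrative assumptions, and exact rational arithmetic is used so that the results can be compared without rounding.

```python
from fractions import Fraction


def lagrange_value(nodes, values, x):
    """Value at x of the polynomial of degree <= n taking values[i] at nodes[i].

    nodes are the n + 1 distinct numbers a_1, ..., a_{n+1}; values are c_1, ..., c_{n+1}.
    """
    total = Fraction(0)
    for i, (a_i, c_i) in enumerate(zip(nodes, values)):
        term = Fraction(c_i)
        for j, a_j in enumerate(nodes):
            if j != i:
                term *= Fraction(x - a_j, a_i - a_j)
        total += term
    return total


# The values 1, 3, 7 at the points 0, 1, 2 determine x^2 + x + 1.
nodes, values = [0, 1, 2], [1, 3, 7]
print(lagrange_value(nodes, values, 3))    # 13
```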

Formula (6) is called the Lagrange interpolation formula. The
term "interpolation" is due to the fact that, using this formula and
knowing the values of the polynomial at $n + 1$ points, it is possible
to compute its values at all other points.

Vieta's formulas. Let there be given a polynomial $f(x)$ of degree
$n$ with leading coefficient 1,

    $f(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + \ldots + a_{n-1} x + a_n$          (7)

and let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be its roots (counting multiplicities). Then
$f(x)$ has the following expansion:

    $f(x) = (x - \alpha_1)(x - \alpha_2) \ldots (x - \alpha_n)$

Multiplying out the parentheses on the right, and then collecting
like terms and comparing the resulting coefficients with the coefficients
of (7), we get the following equations, called Vieta's formulas,
which express the coefficients of a polynomial in terms of its roots:

    $a_1 = -(\alpha_1 + \alpha_2 + \ldots + \alpha_n)$,
    $a_2 = \alpha_1 \alpha_2 + \alpha_1 \alpha_3 + \ldots + \alpha_1 \alpha_n + \alpha_2 \alpha_3 + \ldots + \alpha_{n-1} \alpha_n$,
    $a_3 = -(\alpha_1 \alpha_2 \alpha_3 + \alpha_1 \alpha_2 \alpha_4 + \ldots + \alpha_{n-2} \alpha_{n-1} \alpha_n)$,
    $\ldots$
    $a_{n-1} = (-1)^{n-1} (\alpha_1 \alpha_2 \ldots \alpha_{n-1} + \alpha_1 \alpha_2 \ldots \alpha_{n-2} \alpha_n + \ldots + \alpha_2 \alpha_3 \ldots \alpha_n)$,
    $a_n = (-1)^n \alpha_1 \alpha_2 \ldots \alpha_n$

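Thus, the right side of the $k$th equation, $k = 1, 2, \ldots, n$, contains
a sum of all possible products of $k$ roots taken with the plus
or minus sign, according as $k$ is even or odd.

For $n = 2$, these formulas become the familiar (from elementary
algebra) relationship between the roots and the coefficients of a
quadratic polynomial. For $n = 3$, that is, for a cubic polynomial, these
formulas take the form

    $a_1 = -(\alpha_1 + \alpha_2 + \alpha_3), \quad a_2 = \alpha_1 \alpha_2 + \alpha_1 \alpha_3 + \alpha_2 \alpha_3, \quad a_3 = -\alpha_1 \alpha_2 \alpha_3$

The Vieta formulas simplify writing a polynomial, given its
roots. For instance, find the fourth-degree polynomial $f(x)$ which
has the simple roots 5 and $-2$ and the double root 3. We get

    $a_1 = -(5 - 2 + 3 + 3) = -9$,
    $a_2 = 5 \cdot (-2) + 5 \cdot 3 + 5 \cdot 3 + (-2) \cdot 3 + (-2) \cdot 3 + 3 \cdot 3 = 17$,
    $a_3 = -[5 \cdot (-2) \cdot 3 + 5 \cdot (-2) \cdot 3 + 5 \cdot 3 \cdot 3 + (-2) \cdot 3 \cdot 3] = 33$,
    $a_4 = 5 \cdot (-2) \cdot 3 \cdot 3 = -90$

and therefore

    $f(x) = x^4 - 9x^3 + 17x^2 + 33x - 90$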
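
Vieta's formulas translate directly into a short computation: the coefficient $a_k$ is $(-1)^k$ times the sum of all products of $k$ roots. In the Python sketch below the function name `vieta_coefficients` is illustrative, and the call uses the roots 5, $-2$, 3, 3, which are the values consistent with the polynomial $x^4 - 9x^3 + 17x^2 + 33x - 90$ of the example.

```python
from itertools import combinations
from math import prod


def vieta_coefficients(roots):
    """Coefficients a_1, ..., a_n of x^n + a_1 x^(n-1) + ... + a_n.

    roots lists the n roots, each repeated according to its multiplicity.
    """
    n = len(roots)
    return [(-1) ** k * sum(prod(c) for c in combinations(roots, k))
            for k in range(1, n + 1)]


print(vieta_coefficients([5, -2, 3, 3]))   # [-9, 17, 33, -90]
```

The number of products grows quickly with $n$, so for many roots it is more economical to multiply the linear factors one at a time; the direct sum is used here only because it mirrors the formulas.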

If the leading coefficient $a_0$ of the polynomial $f(x)$ is different
from unity, then in order to make use of Vieta's formulas, it is first
necessary to divide all the coefficients by $a_0$; this has no effect on
the roots of the polynomial. Thus, in this case the Vieta formulas
yield an expression for the ratio of each coefficient to the leading
coefficient.

Polynomials with real coefficients. We now derive some corollaries
to the fundamental theorem of algebra which refer to polynomials
with real coefficients. Actually, it is precisely from these
corollaries that the great significance of the fundamental theorem
of the algebra of complex numbers stems.

Let the following polynomial with real coefficients

    $f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_{n-1} x + a_n$

have a complex root $\alpha$, that is,

    $a_0 \alpha^n + a_1 \alpha^{n-1} + \ldots + a_{n-1} \alpha + a_n = 0$

We know that this equation is unaffected by changing all the numbers
to their conjugates; but all the coefficients $a_0, a_1, \ldots, a_{n-1}, a_n$,
and also the number 0 on the right, being real, will remain unchanged
in such a substitution, and we arrive at the equation

    $a_0 \bar\alpha^n + a_1 \bar\alpha^{n-1} + \ldots + a_{n-1} \bar\alpha + a_n = 0$

that is,

    $f(\bar\alpha) = 0$

Thus, if a complex (but not real) number $\alpha$ serves as a root of a
polynomial $f(x)$ with real coefficients, then the conjugate number $\bar\alpha$ will
also be a root of $f(x)$.

Consequently, the polynomial $f(x)$ will be divisible by the
quadratic trinomial

    $\varphi(x) = (x - \alpha)(x - \bar\alpha) = x^2 - (\alpha + \bar\alpha) x + \alpha \bar\alpha$          (8)

whose coefficients, as we know from Sec. 18, are real. Taking advantage
of this fact, we will prove that the roots $\alpha$ and $\bar\alpha$ have one and
the same multiplicity in the polynomial $f(x)$.

Indeed, let these roots have, respectively, the multiplicities $k$
and $l$ and, say, let $k > l$. Then $f(x)$ is divisible by the $l$th power
of the polynomial $\varphi(x)$,

    $f(x) = \varphi^l(x)\, q(x)$

The polynomial $q(x)$, as a quotient of two polynomials with real
coefficients, also has real coefficients, but, in conflict with what
was proved above, it has the number $\alpha$ for its $(k - l)$-fold root,
whereas the number $\bar\alpha$ is not one of its roots. This means that $k = l$.
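
The divisibility by the real trinomial (8) can be observed on an example. The Python sketch below builds $\varphi(x)$ from a complex root $\alpha$ and divides a real polynomial having that root by it; the helper names, the sample polynomial $x^3 - 5x^2 + 11x - 15 = (x^2 - 2x + 5)(x - 3)$ and the root $\alpha = 1 + 2i$ are assumptions made for this illustration. The numbers are small integers, so the floating-point arithmetic happens to be exact here; in general a tolerance would be needed.

```python
def evaluate(p, x):
    """Value at x of the polynomial p (coefficients, highest degree first)."""
    v = 0
    for a in p:
        v = v * x + a
    return v


def real_quadratic(alpha):
    """Coefficients of x^2 - (alpha + conj(alpha)) x + alpha * conj(alpha)."""
    return [1, -2 * alpha.real, alpha.real ** 2 + alpha.imag ** 2]


def poly_divmod(f, g):
    """Quotient and remainder of f by g (coefficient lists, highest degree first)."""
    r = list(f)
    q = []
    while len(r) >= len(g):
        factor = r[0] / g[0]
        q.append(factor)
        for i, b in enumerate(g):
            r[i] -= factor * b       # cancels the current leading term
        r.pop(0)
    return q, r


alpha = 1 + 2j
f = [1, -5, 11, -15]
phi = real_quadratic(alpha)                       # [1, -2.0, 5.0]
quotient, remainder = poly_divmod(f, phi)
print(evaluate(f, alpha), evaluate(f, alpha.conjugate()))
print(quotient, remainder)
```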

Now we can say that the complex roots of any polynomial with
real coefficients are pairwise conjugate. From this fact and from the
earlier proved uniqueness of expansions of type (2) follows the
final result.

Any polynomial $f(x)$ with real coefficients can be expressed uniquely
(to within the order of the factors) in the form of a product of its
leading coefficient $a_0$ and several linear polynomials with real
coefficients, of the form $x - \alpha$, that correspond to its real roots, and
quadratic polynomials of the form (8) that correspond to pairs of
conjugate complex roots.

For what follows it will be useful to stress that among polynomials
with real coefficients and leading coefficient 1, only linear
polynomials of the form $x - \alpha$ and quadratic polynomials of the
form (8) are irreducible (that is, cannot be decomposed into factors
of lower degree).

25. Rational Fractions 

The course of mathematical analysis deals with integral rational
functions (which we have called polynomials) and also fractional
rational functions. The latter are quotients $\frac{f(x)}{g(x)}$ of two integral
rational functions, where $g(x) \neq 0$. Algebraic operations are performed
on these functions in accord with the same laws as are used
to manipulate rational numbers, that is to say, fractions with integral
numerators and denominators. The equality of two fractional
rational functions, or, as we will now term them, rational fractions,
is to be understood in the same sense as the equality of fractions
in elementary arithmetic. For the sake of definiteness, we consider
rational fractions with real coefficients. The reader will easily note
that this whole section can almost literally be extended to the case
of rational fractions with complex coefficients.

A rational fraction is in lowest terms (simplified) if the numerator
is relatively prime to the denominator.

Any rational fraction is equal to some fraction in lowest terms
which is uniquely defined to within a factor of degree zero common to both
numerator and denominator.

Indeed, any rational fraction may be reduced by dividing numerator
and denominator by the greatest common divisor; this yields
an equivalent fraction in lowest terms. If, moreover, we have two
simplified fractions $\frac{f(x)}{g(x)}$ and $\frac{\varphi(x)}{\psi(x)}$ that are equal, that is,

    $f(x)\, \psi(x) = g(x)\, \varphi(x)$                          (1)

then it follows from the relative primality of $f(x)$ and $g(x)$ [by
Property (b) of Sec. 21] that $f(x)$ divides $\varphi(x)$, and from the relative
primality of $\varphi(x)$ and $\psi(x)$ that $\varphi(x)$ divides $f(x)$. Thus,
$f(x) = c \varphi(x)$, and then from (1) it follows that $g(x) = c \psi(x)$.

A rational fraction is proper if the degree of the numerator is
less than the degree of the denominator. If we include the polynomial
zero in the set of proper fractions, then the following theorem holds.

Any rational fraction may be represented uniquely in the form of
a sum of a polynomial and a proper fraction.

If there is a rational fraction $f(x)/g(x)$ and if, dividing the
numerator by the denominator, we get the equation

$$f(x) = g(x)\,q(x) + r(x)$$

where the degree of $r(x)$ is less than the degree of $g(x)$, then it is
easy to check that

$$\frac{f(x)}{g(x)} = q(x) + \frac{r(x)}{g(x)}$$

If we also have the equation

$$\frac{f(x)}{g(x)} = \bar q(x) + \frac{\varphi(x)}{\psi(x)}$$

where the degree of $\varphi(x)$ is less than the degree of $\psi(x)$,
then we obtain the equation

$$q(x) - \bar q(x) = \frac{\varphi(x)}{\psi(x)} - \frac{r(x)}{g(x)}
  = \frac{\varphi(x)\,g(x) - \psi(x)\,r(x)}{\psi(x)\,g(x)}$$

Since the left-hand side is a polynomial, and the right, as is easily
seen, is a proper fraction, we get $q(x) - \bar q(x) = 0$ and

$$\frac{\varphi(x)}{\psi(x)} - \frac{r(x)}{g(x)} = 0$$
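The division with remainder used in this proof is easy to carry out
mechanically. The following is a minimal Python sketch, not part of the
original text: the name poly_divmod and the representation of a
polynomial as a list of coefficients (highest power first) are
assumptions made purely for illustration. Exact rational arithmetic is
used so that $f = gq + r$ holds without rounding.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Return (q, r) with f = g*q + r and deg r < deg g.

    Polynomials are lists of coefficients, highest power first
    (an illustrative convention, not taken from the text).
    An empty quotient list stands for the zero polynomial.
    """
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while len(g) > 1 and g[0] == 0:      # drop leading zeros of the divisor
        g = g[1:]
    if g == [0]:
        raise ZeroDivisionError("the denominator must be a nonzero polynomial")
    q, r = [], f
    while len(r) >= len(g):
        c = r[0] / g[0]                  # next coefficient of the quotient
        q.append(c)
        # subtract c*g aligned with the leading term of r, then drop that term
        r = [rc - c * gc for rc, gc in zip(r, g)] + r[len(g):]
        r = r[1:]
    return q, r

# (x^3 + 2x + 5)/(x^2 + 1): polynomial part x, proper part (x + 5)/(x^2 + 1)
q, r = poly_divmod([1, 0, 2, 5], [1, 0, 1])
```

Here q and r hold the coefficients of the polynomial part and of the
numerator of the proper fraction, respectively.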

Proper rational fractions can be studied further. As was pointed
out at the end of the last section, irreducible real polynomials are
polynomials of the form $x - a$, where the number $a$ is real, and
polynomials of the form $x^2 - (\beta + \bar\beta)x + \beta\bar\beta$,
where $\beta$ and $\bar\beta$ are a pair of conjugate complex numbers.
It is easy to verify that in the complex case a similar role is played
by polynomials of the form $x - a$, where $a$ is any complex number.

A proper rational fraction $f(x)/g(x)$ is called a partial fraction if
its denominator $g(x)$ is a power of an irreducible polynomial $p(x)$,

$$g(x) = p^k(x), \qquad k \ge 1$$

and the degree of the numerator $f(x)$ is less than that of $p(x)$.
The following fundamental theorem holds.
The following fundamental theorem holds. 

Any proper rational fraction can be decomposed into a sum of
partial fractions.

Proof. We first consider the proper rational fraction
$\frac{f(x)}{g(x)\,h(x)}$, where the polynomials $g(x)$ and $h(x)$ are
relatively prime,

$$(g(x), h(x)) = 1$$

Thus, by Sec. 21, there are polynomials $\bar u(x)$ and $\bar v(x)$
such that

$$g(x)\,\bar u(x) + h(x)\,\bar v(x) = 1$$

Whence

$$g(x)\,[\bar u(x) f(x)] + h(x)\,[\bar v(x) f(x)] = f(x) \qquad (2)$$

Suppose, in dividing the product $\bar u(x) f(x)$ by $h(x)$, we get a
remainder $u(x)$ whose degree is less than the degree of $h(x)$. Then
(2) may be rewritten in the form

$$g(x)\,u(x) + h(x)\,v(x) = f(x) \qquad (3)$$

where $v(x)$ is a polynomial whose expression could readily be written.
Since the degree of the product $g(x)\,u(x)$ is less than the degree
of the product $g(x)\,h(x)$ and this, by hypothesis, is true for the
polynomial $f(x)$, it follows that the product $h(x)\,v(x)$ also has
degree less than that of $g(x)\,h(x)$, and therefore the degree of
$v(x)$ is less than that of $g(x)$. From (3) there now follows the
equation

$$\frac{f(x)}{g(x)\,h(x)} = \frac{v(x)}{g(x)} + \frac{u(x)}{h(x)}$$

the right member of which is a sum of proper fractions.
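A small worked instance may make this step concrete. The choice
$f = x + 3$, $g = x - 1$, $h = x + 1$ is ours and serves only as an
illustration; $\bar u, \bar v$ denote the polynomials of the identity
$g\bar u + h\bar v = 1$, and $u, v$ the numerators finally obtained.

```latex
% Illustrative choice (not from the text): f = x + 3, g = x - 1, h = x + 1.
\begin{align*}
(x-1)\cdot\left(-\tfrac12\right) + (x+1)\cdot\tfrac12 &= 1
  && \bar u = -\tfrac12,\ \bar v = \tfrac12 \\
\bar u\, f = -\tfrac12 (x+3) &= (x+1)\left(-\tfrac12\right) + (-1)
  && \text{remainder } u = -1 \\
f - g\,u = (x+3) + (x-1) &= 2x + 2 = (x+1)\cdot 2
  && v = 2 \\
\frac{x+3}{(x-1)(x+1)} &= \frac{2}{x-1} - \frac{1}{x+1}
\end{align*}
```

Both fractions on the right are proper, as the argument requires.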

If even one of the denominators $g(x)$, $h(x)$ can be factored into
a product of relatively prime factors, then a further decomposition is
possible. Continuing in the same manner, we find that any proper
fraction can be decomposed into a sum of several proper fractions, each
of which has for the denominator a power of some irreducible polynomial.
More precisely, if we are given a proper fraction $f(x)/g(x)$ whose
denominator can be factored into the irreducible factors

$$g(x) = p_1^{k_1}(x)\, p_2^{k_2}(x) \cdots p_l^{k_l}(x)$$

(of course, one can always say that the leading coefficient of the
denominator of a rational fraction is unity), and $p_i(x) \neq p_j(x)$
for $i \neq j$, then it follows that

$$\frac{f(x)}{g(x)} = \frac{u_1(x)}{p_1^{k_1}(x)} + \frac{u_2(x)}{p_2^{k_2}(x)}
  + \ldots + \frac{u_l(x)}{p_l^{k_l}(x)}$$

All the terms on the right of this equation are proper fractions.

It remains to consider a proper fraction of the form $u(x)/p^k(x)$,
where $p(x)$ is an irreducible polynomial. Applying the division
algorithm, divide $u(x)$ by $p^{k-1}(x)$, divide the remainder by
$p^{k-2}(x)$, and so on. We arrive at the following equalities:

$$\begin{aligned}
u(x) &= p^{k-1}(x)\, s_1(x) + u_1(x), \\
u_1(x) &= p^{k-2}(x)\, s_2(x) + u_2(x), \\
&\;\;\vdots \\
u_{k-2}(x) &= p(x)\, s_{k-1}(x) + u_{k-1}(x)
\end{aligned}$$

Since the degree of $u(x)$ is, by hypothesis, less than the degree of
$p^k(x)$, and the degree of each of the remainders $u_i(x)$,
$i = 1, 2, \ldots, k-1$, is less than the degree of the corresponding
divisor $p^{k-i}(x)$, it follows that the degrees of all quotients
$s_1(x), s_2(x), \ldots, s_{k-1}(x)$ will be strictly less than the
degree of the polynomial $p(x)$. The degree of the last remainder
$u_{k-1}(x)$ is also less than the degree of $p(x)$. It follows from the
equations obtained that

$$u(x) = p^{k-1}(x)\, s_1(x) + p^{k-2}(x)\, s_2(x) + \ldots
  + p(x)\, s_{k-1}(x) + u_{k-1}(x)$$

whence we arrive at the desired representation of the rational
fraction as a sum of partial fractions:

$$\frac{u(x)}{p^k(x)} = \frac{u_{k-1}(x)}{p^k(x)} + \frac{s_{k-1}(x)}{p^{k-1}(x)}
  + \ldots + \frac{s_2(x)}{p^2(x)} + \frac{s_1(x)}{p(x)}$$
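The chain of divisions just described can be sketched in Python as
follows. This is an illustration under our own conventions, not part of
the original text: polynomials are lists of coefficients with the
highest power first, and the names trim, poly_mul, poly_divmod and
power_expansion are hypothetical helpers.

```python
from fractions import Fraction

def trim(p):
    """Drop leading zero coefficients; the zero polynomial becomes [0]."""
    p = list(p)
    while len(p) > 1 and p[0] == 0:
        p = p[1:]
    return p or [Fraction(0)]

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += Fraction(x) * Fraction(y)
    return out

def poly_divmod(f, g):
    """Return (q, r) with f = g*q + r and deg r < deg g."""
    f = [Fraction(c) for c in f]
    g = trim([Fraction(c) for c in g])
    if g == [0]:
        raise ZeroDivisionError("division by the zero polynomial")
    q, r = [], f
    while len(r) >= len(g):
        c = r[0] / g[0]
        q.append(c)
        r = [rc - c * gc for rc, gc in zip(r, g)] + r[len(g):]
        r = r[1:]                        # the leading term has been cancelled
    return trim(q), trim(r)

def power_expansion(u, p, k):
    """Numerators [n_1, ..., n_k] such that u/p^k = n_1/p + ... + n_k/p^k.

    Divide u by p^(k-1), the remainder by p^(k-2), and so on; the
    quotient of the division by p^e becomes the numerator over p^(k-e),
    and the last remainder is the numerator over p^k.
    """
    numerators = []
    rem = u
    for e in range(k - 1, 0, -1):        # e = k-1, k-2, ..., 1
        pe = [Fraction(1)]
        for _ in range(e):
            pe = poly_mul(pe, p)         # pe = p^e
        s, rem = poly_divmod(rem, pe)
        numerators.append(s)
    numerators.append(trim(rem))
    return numerators

# x^2/(x - 1)^3 = 1/(x - 1) + 2/(x - 1)^2 + 1/(x - 1)^3
parts = power_expansion([1, 0, 0], [1, -1], 3)
```

The sample call corresponds to the identity
$x^2 = (x-1)^2 + 2(x-1) + 1$.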

The proof of the fundamental theorem is complete. It may be
supplemented by the following uniqueness theorem.

Every proper rational fraction has a unique decomposition into
a sum of partial fractions.

Let some proper fraction be decomposable into sums of partial
fractions in two ways. Subtracting one of these representations
from the other and collecting like terms, we get a sum of partial
fractions identically equal to zero. Let the denominators of the
partial fractions which constitute this sum be certain powers of
distinct irreducible polynomials $p_1(x), p_2(x), \ldots, p_s(x)$ and
let the highest power of the polynomial $p_i(x)$, $i = 1, 2, \ldots, s$,
which is one of these denominators, be $p_i^{k_i}(x)$. Multiply both
sides of the equality at hand by the product
$p_1^{k_1 - 1}(x)\, p_2^{k_2}(x) \cdots p_s^{k_s}(x)$. Then
all the terms of our sum, except one, become polynomials. The term
$\frac{u(x)}{p_1^{k_1}(x)}$ is converted into a fraction whose
denominator is $p_1(x)$ and whose numerator is the product
$u(x)\, p_2^{k_2}(x) \cdots p_s^{k_s}(x)$. The numerator is not exactly
divisible by the denominator since the polynomial $p_1(x)$ is
irreducible, and all the factors of the numerator are relatively prime
to it. Performing division with a remainder, we find that the sum of a
polynomial and a nonzero proper fraction is equal to zero, which is
impossible.

Example. Decompose into a sum of partial fractions the real proper
fraction $f(x)/g(x)$, where

$$f(x) = 2x^4 - 10x^3 + 7x^2 + 4x + 3,$$
$$g(x) = x^5 - 2x^3 + 2x^2 - 3x + 2$$

It is easy to check that

$$g(x) = (x + 2)(x - 1)^2 (x^2 + 1)$$

Each of the polynomials $x + 2$, $x - 1$, $x^2 + 1$ is irreducible.
From the foregoing theory it follows that the desired decomposition
should be of the form

$$\frac{f(x)}{g(x)} = \frac{A}{x+2} + \frac{B}{(x-1)^2} + \frac{C}{x-1}
  + \frac{Dx + E}{x^2+1} \qquad (4)$$

where the numbers $A$, $B$, $C$, $D$ and $E$ have still to be found.

From (4) follows the equation

$$\begin{aligned}
f(x) = {} & A(x-1)^2(x^2+1) + B(x+2)(x^2+1) + C(x+2)(x-1)(x^2+1) \\
 & + Dx(x+2)(x-1)^2 + E(x+2)(x-1)^2
\end{aligned} \qquad (5)$$

Equating coefficients of like powers of the unknown $x$ in both members
of (5), we would get a system of five linear equations in five unknowns
$A$, $B$, $C$, $D$, $E$ and, as follows from what has been said, this
system has a unique solution. However, we will take a different
approach.

Assuming $x = -2$ in (5), we get the equation $45A = 135$, whence

$$A = 3 \qquad (6)$$

Putting $x = 1$ in (5), we get $6B = 6$, or

$$B = 1 \qquad (7)$$

Now, in succession, set $x = 0$ and $x = -1$ in (5). Using (6) and (7),
we get the equations

$$\left.\begin{aligned}
-2C + 2E &= -2, \\
-4C - 4D + 4E &= -8
\end{aligned}\right\} \qquad (8)$$

whence

$$D = 1 \qquad (9)$$

Now, finally, set $x = 2$ in (5). Using (6), (7), and (9), we arrive at
the equation

$$20C + 4E = -52$$

which, together with the first equation of (8), yields

$$C = -2, \qquad E = -3$$

Thus,

$$\frac{f(x)}{g(x)} = \frac{3}{x+2} + \frac{1}{(x-1)^2} - \frac{2}{x-1}
  + \frac{x-3}{x^2+1}$$
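The result of this example can be checked numerically. The sketch below
is our own illustration, not part of the original text: it recomputes
$A$ and $B$ by the substitutions $x = -2$ and $x = 1$, and then compares
$f(x)/g(x)$ with the decomposition
$3/(x+2) + 1/(x-1)^2 - 2/(x-1) + (x-3)/(x^2+1)$ at several rational
points, using exact fractions.

```python
from fractions import Fraction

def f(x):
    return 2*x**4 - 10*x**3 + 7*x**2 + 4*x + 3

def g(x):
    return (x + 2) * (x - 1)**2 * (x**2 + 1)

def decomposition(x):
    """Right-hand side: 3/(x+2) + 1/(x-1)^2 - 2/(x-1) + (x-3)/(x^2+1)."""
    return (Fraction(3) / (x + 2) + Fraction(1) / (x - 1)**2
            - Fraction(2) / (x - 1) + Fraction(x - 3) / (x**2 + 1))

# Substituting x = -2 and x = 1 kills all terms of (5) but one.
A = Fraction(f(-2), (-2 - 1)**2 * ((-2)**2 + 1))   # 135 / 45
B = Fraction(f(1), (1 + 2) * (1**2 + 1))           # 6 / 6

# Both sides have the same denominator of degree 5 and numerators of
# degree at most 4, so agreement at more than four points that are not
# roots of g means the two expressions are identical.
points = [Fraction(n) for n in (0, -1, 2, 3, 5, 7)]
agree = all(Fraction(f(x)) / g(x) == decomposition(x) for x in points)
```

The sample points avoid $x = -2$ and $x = 1$, where the denominators
vanish.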
26. Reducing a Quadratic Form to Canonical Form


The genesis of the theory of quadratic forms lies in analytic 
geometry, namely, in the theory of quadric curves and surfaces. 
It will be recalled that the equation of a central quadric curve 
in a plane, after translating the origin of the rectangular coordinate 
system to the centre of the curve, is of the form 

$$Ax^2 + 2Bxy + Cy^2 = D \qquad (1)$$

It is also possible to perform a rotation of the coordinate axes through
an angle $\alpha$, such that we have the following transformation from
the coordinates $x, y$ to the coordinates $x', y'$:

$$\left.\begin{aligned}
x &= x'\cos\alpha - y'\sin\alpha, \\
y &= x'\sin\alpha + y'\cos\alpha
\end{aligned}\right\} \qquad (2)$$

Then the equation of our curve in the new coordinates will be of
“canonical” form:

$$A'x'^2 + C'y'^2 = D \qquad (3)$$

In this equation, the coefficient of the product of unknowns $x'y'$
is, thus, zero. The transformation of coordinates (2) may obviously
be interpreted as a linear transformation of the unknowns (see
Sec. 13); the transformation is nonsingular since the determinant
of its coefficients is equal to unity. This transformation is applied
to the left side of (1) and for this reason we can say that the left
member of (1) is converted into the left side of (3) by the nonsingular
linear transformation (2).

Numerous applications required the construction of a similar 
theory for the case when the number of unknowns is equal to an 
arbitrary n instead of two, and the coefficients are either real or any 
complex numbers. 

Generalizing the expression on the left of (1), we arrive at the
following concept.

A quadratic form $f$ in $n$ unknowns $x_1, x_2, \ldots, x_n$ is a sum,
each term of which is either a square of one of the unknowns or a
product of two different unknowns. A quadratic form is called real or
complex according as its coefficients are real or complex numbers.

If we take it that like terms in the quadratic form $f$ have already
been collected, we can introduce the following notations for the
coefficients of this form: we denote by $a_{ii}$ the coefficient of
$x_i^2$, and by $2a_{ij}$ [compare with (1)!] the coefficient of the
product $x_i x_j$ for $i \neq j$. However, since $x_i x_j = x_j x_i$,
the coefficient of this product could be written as $2a_{ji}$, that is,
the designations we have proposed presume the validity of the equality

$$a_{ji} = a_{ij} \qquad (4)$$

The term $2a_{ij}x_i x_j$ may now be written as

$$2a_{ij}x_i x_j = a_{ij}x_i x_j + a_{ji}x_j x_i$$

and the entire quadratic form $f$ may be written in the form of a sum
of all possible terms $a_{ij}x_i x_j$, where $i$ and $j$ independently
take on the values from 1 to $n$:

$$f = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j \qquad (5)$$

In particular, for $i = j$ we have the term $a_{ii}x_i^2$.

Obviously, we can construct a square matrix $A = (a_{ij})$ of order
$n$ out of the coefficients $a_{ij}$; it is called the matrix of the
quadratic form $f$, and its rank $r$ is called the rank of the quadratic
form. If, say, $r = n$, that is, the matrix is nonsingular, then the
quadratic form $f$ is termed nonsingular too. Due to (4), the elements
of matrix A
which are symmetric about the principal diagonal are equal; that 
is, matrix A is a symmetric matrix. Conversely, for any symmetric 
matrix A of order n there is a definite quadratic form (5) in n 
unknowns having for coefficients the elements of the matrix A. 

The quadratic form (5) may be written differently by using the
multiplication of rectangular matrices introduced in Sec. 14. Let
us make the following convention: if we have a square or, generally,
rectangular matrix $A$, then $A'$ will denote the transpose of $A$. If
matrices $A$ and $B$ are such that their product is defined, then we
have the equality

$$(AB)' = B'A' \qquad (6)$$

Thus, the transpose of a product of matrices is equal to the product
of the transposes of the matrices in reverse order.

Indeed, if the product $AB$ is defined, then, as may easily be
verified, the product $B'A'$ will also be defined: the number of columns
of matrix $B'$ is equal to the number of rows of matrix $A'$. The
element of matrix $(AB)'$ in the $i$th row and $j$th column lies in the
$j$th row and $i$th column of the matrix $AB$. It is therefore equal to
the sum of the products of the corresponding elements of the $j$th row
of matrix $A$ and the $i$th column of matrix $B$, which is to say it is
equal to the sum of the products of the corresponding elements of
the $j$th column of matrix $A'$ and the $i$th row of matrix $B'$. This
proves (6).

Note that the matrix $A$ is symmetric if and only if it coincides
with its transpose, i.e., if

$$A' = A$$

Now denote by $X$ the column made up of the unknowns:

$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$

$X$ is a matrix with $n$ rows and one column. Its transpose is the
matrix

$$X' = (x_1, x_2, \ldots, x_n)$$

comprising a single row.

The quadratic form (5) with matrix $A = (a_{ij})$ may now be written
as a product:

$$f = X'AX \qquad (7)$$

Indeed, the product $AX$ will be a matrix consisting of one column:

$$AX = \begin{pmatrix}
\sum_{j=1}^{n} a_{1j}x_j \\
\sum_{j=1}^{n} a_{2j}x_j \\
\vdots \\
\sum_{j=1}^{n} a_{nj}x_j
\end{pmatrix}$$

Multiplying this matrix on the left by the matrix $X'$, we get a
“matrix” consisting of one row and one column, namely, the right side
of (5).
What will happen to the quadratic form $f$ if the unknowns
$x_1, x_2, \ldots, x_n$ in it are subjected to the linear transformation

$$x_i = \sum_{k=1}^{n} q_{ik} y_k, \qquad i = 1, 2, \ldots, n \qquad (8)$$

with the matrix $Q = (q_{ik})$? We will assume here that if the form $f$
is real, then the elements of the matrix $Q$ must be real. Denoting
by $Y$ the column of unknowns $y_1, y_2, \ldots, y_n$, let us write the
linear transformation (8) in the form of a matrix equation:

$$X = QY \qquad (9)$$

whence, from (6),

$$X' = Y'Q' \qquad (10)$$

Substituting (9) and (10) into (7), we get

$$f = Y'(Q'AQ)\,Y$$

or

$$f = Y'BY$$

where

$$B = Q'AQ$$

The matrix $B$ is symmetric since, because of (6), which is
obviously true for any number of factors, and due to the equality
$A' = A$, which is equivalent to the symmetry of matrix $A$, we have

$$B' = Q'A'Q = Q'AQ = B$$

This is proof of the following theorem.

A quadratic form in $n$ unknowns having a matrix $A$ is converted
(after performing a linear transformation of the unknowns with matrix
$Q$) into a quadratic form in new unknowns, the product $Q'AQ$ serving
as the matrix of this form.
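A small numerical check of this theorem, given as an illustration of
ours rather than as part of the original text: for the form
$f = 2x_1x_2$ and the substitution $x_1 = y_1 - y_2$, $x_2 = y_1 + y_2$,
the product $Q'AQ$ should be the matrix of $2y_1^2 - 2y_2^2$. The helper
names transpose, matmul and transform_form are assumptions.

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(P, Q):
    """Ordinary matrix product with exact rational arithmetic."""
    return [[sum(Fraction(P[i][k]) * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transform_form(A, Q):
    """Matrix of the form obtained by substituting X = QY, namely Q'AQ."""
    return matmul(matmul(transpose(Q), A), Q)

A = [[0, 1], [1, 0]]      # f = 2*x1*x2
Q = [[1, -1], [1, 1]]     # x1 = y1 - y2, x2 = y1 + y2
B = transform_form(A, Q)  # matrix of 2*y1^2 - 2*y2^2
```

The result is again symmetric, as the argument preceding the theorem
predicts.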

Now assume that we perform a nonsingular linear transformation;
that is, $Q$ and, therefore, $Q'$ too are nonsingular matrices.
In this case, the product $Q'AQ$ is obtained by multiplying matrix $A$
by nonsingular matrices; for this reason, as follows from the
results of Sec. 14, the rank of this product is equal to the rank of
matrix $A$. Thus, the rank of a quadratic form does not change under
a nonsingular linear transformation.

By analogy with the geometric problem, indicated at the beginning
of this section, of reducing the equation of a central quadric
curve to canonical form (3), let us now consider the question of
reducing an arbitrary quadratic form (by some nonsingular linear
transformation) to a sum of squares of the unknowns, that is to say,
to a form where all coefficients of products of distinct unknowns are
zero. This special form of the quadratic form is called canonical.
First, let us suppose that a quadratic form $f$ in $n$ unknowns
$x_1, x_2, \ldots, x_n$ has already been reduced (via a nonsingular
linear transformation) to the canonical form

$$f = b_1 y_1^2 + b_2 y_2^2 + \ldots + b_n y_n^2 \qquad (11)$$

where $y_1, y_2, \ldots, y_n$ are the new unknowns. Some of the
coefficients $b_1, b_2, \ldots, b_n$ may of course be zeros. We will
prove that the number of nonzero coefficients in (11) is invariably
equal to the rank $r$ of the form $f$.

Indeed, since we reached (11) by means of a nonsingular
transformation, the quadratic form on the right of (11) must also be of
rank $r$. But the matrix of this quadratic form is diagonal:

$$\begin{pmatrix}
b_1 & & & 0 \\
 & b_2 & & \\
 & & \ddots & \\
0 & & & b_n
\end{pmatrix}$$

and a requirement that this matrix have rank $r$ is equivalent to
supposing that its principal diagonal contains exactly $r$ nonzero
elements.

We now take up the proof of the following fundamental theorem
on quadratic forms.

Any quadratic form may be reduced to canonical form by means
of a nonsingular linear transformation. If a real quadratic form is
under consideration, then all the coefficients of this linear
transformation may be taken to be real.

This theorem is true for the case of quadratic forms in one
unknown since every such form has the form $ax^2$, which is canonical.
We can therefore carry out the proof by induction with respect to
the number of unknowns; that is, we can prove the theorem for
quadratic forms in $n$ unknowns, assuming it proved for forms with
a smaller number of unknowns.

Suppose we have the quadratic form

$$f = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j \qquad (12)$$

in the $n$ unknowns $x_1, x_2, \ldots, x_n$. We try to find a
nonsingular linear transformation that isolates from $f$ a square of
one of the unknowns, that is, one that reduces $f$ to the form of a sum
of this square and some quadratic form in the remaining unknowns. This
is readily achieved if among the coefficients
$a_{11}, a_{22}, \ldots, a_{nn}$ in the principal diagonal of the matrix
of the form $f$ there are some nonzero coefficients, that is to say, if
the square of at least one of the unknowns $x_i$ enters into (12) with
a nonzero coefficient.

For example, let $a_{11} \neq 0$. Then it will be easy to see that the
expression $a_{11}^{-1}(a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n)^2$,
which is a quadratic form, contains the same terms with the unknown
$x_1$ as our form $f$, and so the difference

$$f - a_{11}^{-1}(a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n)^2 = g$$

is a quadratic form containing only the unknowns $x_2, \ldots, x_n$
but not $x_1$. Whence

$$f = a_{11}^{-1}(a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n)^2 + g$$

If we introduce the designations

$$y_1 = a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n, \qquad
  y_i = x_i \ \text{ for } i = 2, 3, \ldots, n \qquad (13)$$

we obtain

$$f = a_{11}^{-1} y_1^2 + g \qquad (14)$$

where $g$ is now a quadratic form in the unknowns
$y_2, y_3, \ldots, y_n$. Expression (14) is the desired expression for
the form $f$, since it was obtained from (12) by a nonsingular linear
transformation, namely, by a transformation inverse to the linear
transformation (13), which has $a_{11}$ for its determinant and is
therefore not singular.

However, if we have the equalities
$a_{11} = a_{22} = \ldots = a_{nn} = 0$, then we first have to perform
an auxiliary linear transformation that leads to the appearance, in our
form $f$, of squares of the unknowns. Since there must be nonzero
coefficients among those in (12) of this form (otherwise there would be
nothing to prove), suppose, say, that $a_{12} \neq 0$, i.e., $f$ is the
sum of the term $2a_{12}x_1x_2$ and of terms such that each contains at
least one of the unknowns $x_3, \ldots, x_n$.

Let us now perform the linear transformation

$$x_1 = z_1 - z_2, \qquad x_2 = z_1 + z_2, \qquad
  x_i = z_i \ \text{ for } i = 3, \ldots, n \qquad (15)$$

It will be nonsingular since it has the determinant

$$\begin{vmatrix}
1 & -1 & 0 & \ldots & 0 \\
1 & 1 & 0 & \ldots & 0 \\
0 & 0 & 1 & \ldots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \ldots & 1
\end{vmatrix} = 2 \neq 0$$

As a result of this transformation, the term $2a_{12}x_1x_2$ of our
form becomes

$$2a_{12}x_1x_2 = 2a_{12}(z_1 - z_2)(z_1 + z_2) = 2a_{12}z_1^2 - 2a_{12}z_2^2$$

In other words, in form $f$ there will appear the squares of two
unknowns at once with nonzero coefficients; what is more, they do
not cancel with any one of the remaining terms, since each one
of the latter contains at least one of the unknowns $z_3, \ldots, z_n$.
We are now in the conditions of the case that has already been
considered; one more nonsingular linear transformation will reduce
the form $f$ to the form (14).

To conclude the proof, note that the quadratic form $g$ depends
on a smaller (than $n$) number of unknowns and for this reason, by
the induction hypothesis, it is reducible to the canonical form by
means of a nonsingular transformation of the unknowns
$y_2, y_3, \ldots, y_n$. This transformation, which we regard as a
(nonsingular, quite obviously) transformation of all $n$ unknowns under
which $y_1$ remains unchanged, consequently reduces (14) to canonical
form. Thus, by means of two or three nonsingular linear transformations,
which may be replaced by a single nonsingular transformation
(their product), a quadratic form $f$ may be reduced to a sum of
squares of the unknowns with certain coefficients. And, as we know, the
number of such squares is equal to the rank $r$ of the form. If,
besides, the quadratic form $f$ is real, then the coefficients both in
the canonical form of $f$ and in the linear transformation which
reduces $f$ to this canonical form will be real; indeed, both the linear
transformation which is inverse to (13) and the linear transformation
(15) have real coefficients.

The proof of the fundamental theorem is complete. The method 
employed in this proof can be used in specific examples for an actual 
reduction of a quadratic form to canonical form. It is only necessary, 
in place of the induction we used in the proof, to isolate the squares 
of the unknowns successively by the method given above. 
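The successive isolation of squares can be sketched in Python as
follows. This is a minimal illustration under our own assumptions, not
the author's procedure verbatim: the form is given by its symmetric
matrix, every step is applied as a congruence $S'AS$, a renumbering of
the unknowns (itself a nonsingular transformation) is used to bring a
nonzero diagonal coefficient into place, and the names identity,
transpose, matmul and lagrange_reduce are hypothetical.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def lagrange_reduce(A):
    """Return (coeffs, Q): Q'AQ is diagonal with diagonal coeffs."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    Q = identity(n)

    def apply(S):
        # substitute (old unknowns) = S * (new unknowns)
        nonlocal A, Q
        A = matmul(matmul(transpose(S), A), S)
        Q = matmul(Q, S)

    for k in range(n):
        while A[k][k] == 0:
            diag = [i for i in range(k + 1, n) if A[i][i] != 0]
            off = [(i, j) for i in range(k, n) for j in range(i + 1, n)
                   if A[i][j] != 0]
            S = identity(n)
            if diag:                     # renumber: swap unknowns k and i
                i = diag[0]
                S[k][k] = S[i][i] = Fraction(0)
                S[k][i] = S[i][k] = Fraction(1)
            elif off:                    # x_i = z_i - z_j, x_j = z_i + z_j
                i, j = off[0]
                S[i][j] = Fraction(-1)
                S[j][i] = Fraction(1)
            else:                        # the remaining form is zero
                return [A[i][i] for i in range(n)], Q
            apply(S)
        if all(A[k][j] == 0 for j in range(k + 1, n)):
            continue                     # square already isolated
        a = A[k][k]
        S = identity(n)                  # inverse of y_k = a*x_k + sum a_kj*x_j
        S[k][k] = 1 / a
        for j in range(k + 1, n):
            S[k][j] = -A[k][j] / a
        apply(S)
    return [A[i][i] for i in range(n)], Q

# f = 2*x1*x2 - 6*x2*x3 + 2*x3*x1
coeffs, Q = lagrange_reduce([[0, 1, 1], [1, 0, -3], [1, -3, 0]])
```

For the sample form the sketch produces the coefficients
$\tfrac12, -\tfrac12, 6$; other choices of pivots would lead to other
canonical forms of the same form.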


Example. Reduce to canonical form the quadratic form

$$f = 2x_1x_2 - 6x_2x_3 + 2x_3x_1 \qquad (16)$$

Since there are no squares of the unknowns in this form, we first perform
a nonsingular linear transformation

$$x_1 = y_1 - y_2, \qquad x_2 = y_1 + y_2, \qquad x_3 = y_3$$

with the matrix

$$A = \begin{pmatrix} 1 & -1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

This yields

$$f = 2y_1^2 - 2y_2^2 - 4y_1y_3 - 8y_2y_3$$

Now the coefficient of $y_1^2$ is nonzero, and so we can isolate the
square of one unknown. Setting

$$z_1 = 2y_1 - 2y_3, \qquad z_2 = y_2, \qquad z_3 = y_3$$

that is, performing a linear transformation, the inverse of which has
the matrix

$$B = \begin{pmatrix} \tfrac12 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

we reduce $f$ to the form

$$f = \tfrac12 z_1^2 - 2z_2^2 - 2z_3^2 - 8z_2z_3$$

So far only the square of the unknown $z_1$ has been isolated, since the
form still contains the product of two other unknowns. Using the fact
that the coefficient of $z_2^2$ is nonzero, we again apply the method
described above. Performing the linear transformation

$$t_1 = z_1, \qquad t_2 = -2z_2 - 4z_3, \qquad t_3 = z_3$$

the inverse of which has the matrix

$$C = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -\tfrac12 & -2 \\ 0 & 0 & 1 \end{pmatrix}$$

we finally reduce the form $f$ to canonical form:

$$f = \tfrac12 t_1^2 - \tfrac12 t_2^2 + 6t_3^2 \qquad (17)$$

The linear transformation that immediately reduces (16) to (17) will have
for its matrix the product

$$ABC = \begin{pmatrix} \tfrac12 & \tfrac12 & 3 \\ \tfrac12 & -\tfrac12 & -1 \\ 0 & 0 & 1 \end{pmatrix}$$

It is also possible, by direct substitution, to verify that the
nonsingular (since the determinant is equal to $-\tfrac12$) linear
transformation

$$\begin{aligned}
x_1 &= \tfrac12 t_1 + \tfrac12 t_2 + 3t_3, \\
x_2 &= \tfrac12 t_1 - \tfrac12 t_2 - t_3, \\
x_3 &= t_3
\end{aligned}$$

converts (16) into (17).
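The direct substitution mentioned at the end of the example can be
carried out by machine. The sketch below is an illustration of ours,
not part of the original text: it takes the transformation
$x_1 = \tfrac12 t_1 + \tfrac12 t_2 + 3t_3$,
$x_2 = \tfrac12 t_1 - \tfrac12 t_2 - t_3$, $x_3 = t_3$, compares
$2x_1x_2 - 6x_2x_3 + 2x_3x_1$ with
$\tfrac12 t_1^2 - \tfrac12 t_2^2 + 6t_3^2$ on a grid of integer points,
and computes the determinant of the transformation.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def f(x1, x2, x3):
    return 2*x1*x2 - 6*x2*x3 + 2*x3*x1

def canonical(t1, t2, t3):
    return HALF*t1**2 - HALF*t2**2 + 6*t3**2

def substitute(t1, t2, t3):
    """Express x1, x2, x3 through t1, t2, t3."""
    return (HALF*t1 + HALF*t2 + 3*t3, HALF*t1 - HALF*t2 - t3, t3)

def det3(m):
    (a, b, c), (d, e, h), (p, q, r) = m
    return a*(e*r - h*q) - b*(d*r - h*p) + c*(d*q - e*p)

M = [[HALF, HALF, 3], [HALF, -HALF, -1], [0, 0, 1]]

# Two quadratic forms in three unknowns that agree on this whole grid
# of integer points have identical coefficients.
grid = list(product(range(-2, 3), repeat=3))
agree = all(f(*substitute(*t)) == canonical(*t) for t in grid)
determinant = det3(M)
```

A nonzero determinant confirms that the transformation is nonsingular.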


The theory of reducing a quadratic form to canonical form is 
based on an analogy with the geometric theory of central quadric 
curves but it cannot be considered a generalization of this latter 
theory. Actually, in our theory we are allowed to use any nonsin- 
gular linear transformations, whereas reducing a quadric to canoni- 
cal form is achieved by applying linear transformations of a very 
special kind (2); these transformations are rotations of the plane.
However, this geometric theory can be generalized to the case of
quadratic forms in $n$ unknowns with real coefficients. The
generalization, which goes by the name of reduction of quadratic forms
to principal axes, will be given in Chapter 8.

27. Law of Inertia

The canonical form to which a given quadratic form is reduced 
is by no means uniquely determined: any quadratic form may be 
reduced to canonical form in many different ways. Thus, the quadratic
form $f = 2x_1x_2 - 6x_2x_3 + 2x_3x_1$ that was considered in the
preceding section can, by the following nonsingular linear
transformation,

$$\begin{aligned}
x_1 &= t_1 + 3t_2 + 2t_3, \\
x_2 &= t_1 - t_2 - 2t_3, \\
x_3 &= t_2
\end{aligned}$$

be reduced to the canonical form

$$f = 2t_1^2 + 6t_2^2 - 8t_3^2$$

which is different from the earlier obtained form.

The question arises as to what these different canonical quadratic
forms to which the given form $f$ is reduced have in common.
As we shall see, this question is closely connected with the following
one: under what condition can one of the two given quadratic forms
be carried into the other by a nonsingular linear transformation?
The answer depends on whether we are considering complex or real
quadratic forms.

First suppose we are considering arbitrary complex quadratic
forms; at the same time, let us assume we admit the use of nonsingular
linear transformations also with arbitrary complex coefficients.
We know that any quadratic form $f$ in $n$ unknowns having
rank $r$ can be reduced to the canonical form

$$f = c_1 y_1^2 + c_2 y_2^2 + \ldots + c_r y_r^2$$

where all the coefficients $c_1, c_2, \ldots, c_r$ are nonzero. Using
the fact that we can take the square root of any complex number, let us
perform the following nonsingular linear transformation:

$$z_i = \sqrt{c_i}\, y_i \ \text{ for } i = 1, 2, \ldots, r; \qquad
  z_j = y_j \ \text{ for } j = r + 1, \ldots, n$$

It reduces $f$ to the form

$$f = z_1^2 + z_2^2 + \ldots + z_r^2 \qquad (1)$$

which is called normal. This is simply the sum of the squares of $r$
unknowns with coefficients equal to unity.

The normal form depends solely on the rank $r$ of the form $f$,
that is, all quadratic forms of rank $r$ can be reduced to one and the
same normal form (1). Consequently, if forms $f$ and $g$ in $n$ unknowns
have the same rank $r$, then we can transform $f$ to (1) and then (1)
to $g$; in other words, there exists a nonsingular linear transformation
that takes $f$ into $g$. Since, on the other hand, no nonsingular linear
transformation alters the rank of the form, we arrive at the following
result.

Two complex quadratic forms in $n$ unknowns can be carried one
into the other by means of nonsingular linear transformations with
complex coefficients if and only if these forms have one and the same
rank.

It very easily follows from this theorem that any sum of squares
of $r$ unknowns with any nonzero complex coefficients can serve as the
canonical form of a complex quadratic form of rank $r$.

The situation is somewhat more complicated if we consider
real quadratic forms and (this is particularly important) if we
allow only for linear transformations with real coefficients. Now
not every form can be reduced to (1), since this might require taking
the square root of a negative number. However, if we now use the
term normal form of a quadratic form for the sum of squares of several
unknowns with coefficients +1 or -1, then it is easy to show
that any real quadratic form $f$ may be reduced to the normal form via
a nonsingular linear transformation with real coefficients.

Indeed, the form $f$ of rank $r$ in $n$ unknowns can be reduced to
a canonical form that can be written as follows (the numbering
of the unknowns may be changed if necessary):

$$f = c_1 y_1^2 + \ldots + c_k y_k^2 - c_{k+1} y_{k+1}^2 - \ldots - c_r y_r^2,
  \qquad 0 \le k \le r$$

where all the numbers $c_1, \ldots, c_k, c_{k+1}, \ldots, c_r$ are
nonzero and positive. Then the nonsingular linear transformation with
real coefficients

$$z_i = \sqrt{c_i}\, y_i \ \text{ for } i = 1, 2, \ldots, r; \qquad
  z_j = y_j \ \text{ for } j = r + 1, \ldots, n$$

reduces $f$ to normal form:

$$f = z_1^2 + \ldots + z_k^2 - z_{k+1}^2 - \ldots - z_r^2$$

The total number of squares here is equal to the rank of the form.

A real quadratic form may be reduced to normal form by many
different transformations; however, to within the numbering of the
unknowns, it can be reduced only to one normal form. This is
demonstrated by the following important theorem, which is called the
law of inertia of real quadratic forms.

The number of positive and the number of negative squares in the
normal form to which a given quadratic form with real coefficients
can be reduced by a real nonsingular linear transformation is
independent of the choice of the transformation.

Indeed, let a quadratic form $f$ of rank $r$ in $n$ unknowns
$x_1, x_2, \ldots, x_n$ be reduced to normal form in two ways:

$$f = y_1^2 + \ldots + y_k^2 - y_{k+1}^2 - \ldots - y_r^2
    = z_1^2 + \ldots + z_l^2 - z_{l+1}^2 - \ldots - z_r^2 \qquad (2)$$

Since the transition from the unknowns $x_1, x_2, \ldots, x_n$ to the
unknowns $y_1, y_2, \ldots, y_n$ was a nonsingular linear
transformation, it follows, conversely, that the second set of unknowns
will also be expressed linearly in terms of the first set with a nonzero
determinant:

$$y_i = \sum_{s=1}^{n} a_{is} x_s, \qquad i = 1, 2, \ldots, n \qquad (3)$$

Similarly,

$$z_j = \sum_{t=1}^{n} b_{jt} x_t, \qquad j = 1, 2, \ldots, n \qquad (4)$$

the determinant of the coefficients again being different from zero.
The coefficients are real numbers both in (3) and in (4).

Now suppose that $k < l$. Write the system of equalities

$$y_1 = 0, \ \ldots, \ y_k = 0, \quad z_{l+1} = 0, \ \ldots, \ z_r = 0,
  \ \ldots, \ z_n = 0 \qquad (5)$$

If the left members of these equalities are replaced by their
expressions taken from (3) and (4), we get a system of $n - l + k$
homogeneous linear equations in $n$ unknowns $x_1, x_2, \ldots, x_n$.
The number of equations in this system is less than the number of
unknowns. For this reason, as we know from Sec. 1, our system has a
nonzero real solution $\alpha_1, \alpha_2, \ldots, \alpha_n$.

Now in (2) let us replace all $y$'s and all $z$'s by their expressions
(3) and (4), and then let us substitute for the unknowns the numbers
$\alpha_1, \alpha_2, \ldots, \alpha_n$. If for brevity the values of the
unknowns $y_i$ and $z_j$ obtained in this substitution are denoted by
$y_i(\alpha)$ and $z_j(\alpha)$, then, by (5), (2) becomes

$$-y_{k+1}^2(\alpha) - \ldots - y_r^2(\alpha)
  = z_1^2(\alpha) + \ldots + z_l^2(\alpha) \qquad (6)$$

Since all the coefficients in (3) and (4) are real, all the squares in
(6) are nonnegative, and for this reason (6) implies that all these
squares are zero, whence follow the equalities

$$z_1(\alpha) = 0, \ \ldots, \ z_l(\alpha) = 0 \qquad (7)$$

On the other hand, by the very choice of the numbers
$\alpha_1, \alpha_2, \ldots, \alpha_n$,

$$z_{l+1}(\alpha) = 0, \ \ldots, \ z_r(\alpha) = 0, \ \ldots, \
  z_n(\alpha) = 0 \qquad (8)$$

Thus, the system of $n$ homogeneous linear equations

$$z_i = 0, \qquad i = 1, 2, \ldots, n$$


in $n$ unknowns $x_1, x_2, \ldots, x_n$ has, by (7) and (8), the
nontrivial solution $\alpha_1, \alpha_2, \ldots, \alpha_n$; that is, the
determinant of this system must be zero. This however contradicts the
fact that the transformation (4) was presumed to be nonsingular. We
have the same contradiction for $l < k$, whence follows $k = l$, which
proves the theorem.

The number of positive squares in the normal form to which
a given real quadratic form $f$ is reduced is called the positive index
of inertia of this form; the number of negative squares is termed
the negative index of inertia, and the difference between the positive
and the negative indices of inertia is the signature of the form $f$.
Clearly, if we are given the rank of a form, any one of the three
numbers just defined will fully determine the other two, and for this
reason, we can speak of any one of the three
numbers in subsequent formulations.
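These definitions are easily illustrated on canonical forms, since
passing from a real canonical form to the normal form only rescales the
unknowns and so keeps the signs of the coefficients. The sketch below
is our own illustration (the function name inertia is hypothetical); it
is applied to the coefficient lists $(\tfrac12, -\tfrac12, 6)$ and
$(2, 6, -8)$ of two canonical forms of $2x_1x_2 - 6x_2x_3 + 2x_3x_1$.

```python
from fractions import Fraction

def inertia(coeffs):
    """Rank, indices of inertia and signature read off a canonical form."""
    positive = sum(1 for c in coeffs if c > 0)
    negative = sum(1 for c in coeffs if c < 0)
    return {
        "rank": positive + negative,
        "positive": positive,
        "negative": negative,
        "signature": positive - negative,
    }

first = inertia([Fraction(1, 2), Fraction(-1, 2), 6])
second = inertia([2, 6, -8])
same = first == second   # the law of inertia predicts equal indices
```

Although the two canonical forms differ, they yield the same rank and
the same indices of inertia, in agreement with the law of inertia.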

We now prove the following theorem. 

Two quadratic forms in $n$ unknowns with real coefficients are carried
one into the other by real nonsingular linear transformations if and
only if the forms have the same ranks and the same signatures.

Indeed, let a form $f$ be carried into a form $g$ by a real
nonsingular transformation. We know that this transformation does not
alter the rank of the form. Neither can it change the signature, for
then $f$ and $g$ would reduce to different normal forms, but then $f$
would reduce (in conflict with the law of inertia) to both these
normal forms. Conversely, if the forms $f$ and $g$ have the same ranks
and the same signatures, then they reduce to one and the same normal
form and therefore can be carried into one another.

If we have a quadratic form $g$ in canonical form with nonzero
real coefficients

$$g = b_1 y_1^2 + b_2 y_2^2 + \ldots + b_r y_r^2 \qquad (9)$$

then the rank of this form is obviously equal to $r$. Taking advantage
of the procedure used earlier of reducing such a form to the normal
form, it is easy to see that the positive index of inertia of form $g$
is equal to the number of positive coefficients in the right member
of (9). From this and from the preceding theorem we obtain the
following result.

A quadratic form $f$ has form (9) as its canonical form if and only
if the rank of $f$ is equal to $r$ and the positive index of inertia of
this form coincides with the number of positive coefficients in (9).

Decomposable quadratic forms. By multiplying any two linear
forms in $n$ unknowns,

$$\varphi = a_1 x_1 + a_2 x_2 + \ldots + a_n x_n, \qquad \psi = b_1 x_1 + b_2 x_2 + \ldots + b_n x_n,$$

we obviously get a quadratic form. Not every quadratic form
can be represented as a product of two linear forms, and we wish
to derive the conditions under which this occurs, that is, the
conditions under which a quadratic form is decomposable.

A complex quadratic form $f(x_1, x_2, \ldots, x_n)$ is decomposable
if and only if its rank is less than or equal to two. A real quadratic
form $f(x_1, x_2, \ldots, x_n)$ is decomposable if and only if either its rank
does not exceed unity or the rank is equal to two and the signature is zero.

Let us first consider the product $\varphi\psi$ of the linear forms $\varphi$ and $\psi$.
If at least one of them is a zero form, then their product will be
a quadratic form with zero coefficients, which means it has rank 0.
If the linear forms $\varphi$ and $\psi$ are proportional,

$$\psi = c\varphi$$

and $c \neq 0$ and the form $\varphi$ is nonzero, then, for example, let the
coefficient $a_1$ be different from zero. Then the nonsingular linear
transformation

$$y_1 = a_1 x_1 + \ldots + a_n x_n, \qquad y_i = x_i, \quad i = 2, 3, \ldots, n$$

reduces the quadratic form $\varphi\psi$ to

$$\varphi\psi = c y_1^2$$

On the right is a quadratic form of rank 1, and so the quadratic
form $\varphi\psi$ has rank 1. Finally, if the linear forms $\varphi$ and $\psi$ are not
proportional then, say, let

$$\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \neq 0$$

Then the linear transformation

$$y_1 = a_1 x_1 + a_2 x_2 + \ldots + a_n x_n,$$

$$y_2 = b_1 x_1 + b_2 x_2 + \ldots + b_n x_n,$$

$$y_i = x_i \quad \text{for } i = 3, 4, \ldots, n$$

will be nonsingular; it reduces the quadratic form $\varphi\psi$ to

$$\varphi\psi = y_1 y_2$$

On the right is a quadratic form of rank 2, which in the case of real
coefficients has a signature of 0.

Let us now prove the converse. A quadratic form of rank 0 can
of course be regarded as a product of two linear forms, one of which
is a zero form. Next, a quadratic form $f(x_1, x_2, \ldots, x_n)$ of rank 1
is reduced by a nonsingular linear transformation to

$$f = c y_1^2, \quad c \neq 0$$

that is, to the form

$$f = (c y_1) y_1$$

Expressing $y_1$ linearly in terms of $x_1, x_2, \ldots, x_n$, we get a
representation of the form $f$ as a product of two linear forms. Finally,
the real quadratic form $f(x_1, x_2, \ldots, x_n)$ of rank 2 and signature 0
is reduced by a nonsingular linear transformation to

$$f = y_1^2 - y_2^2$$

Any complex quadratic form of rank 2 can be reduced to this same
form. However,

$$y_1^2 - y_2^2 = (y_1 - y_2)(y_1 + y_2)$$

and after replacing $y_1$ and $y_2$ by their linear expressions in terms
of $x_1, x_2, \ldots, x_n$ we will have on the right a product of two linear
forms. This proves the theorem.
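The criterion for real forms can be tried out on small examples. The following Python sketch is only an illustration and is not part of the original text: the function names are invented here, the form is assumed to be given by its symmetric matrix, and the canonical form is found by ordinary simultaneous row and column operations in exact rational arithmetic.

```python
from fractions import Fraction


def diagonal_by_congruence(matrix):
    """Return the coefficients of a canonical form of the quadratic form with
    the given symmetric matrix, using simultaneous row and column operations
    (each pair corresponds to a nonsingular change of unknowns)."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]

    def add_multiple(target, source, factor):
        # Row operation followed by the matching column operation.
        for c in range(n):
            a[target][c] += factor * a[source][c]
        for r in range(n):
            a[r][target] += factor * a[r][source]

    for k in range(n):
        if a[k][k] == 0:
            # Prefer a later unknown whose square is present: swap it with x_k.
            j = next((j for j in range(k + 1, n) if a[j][j] != 0), None)
            if j is not None:
                a[k], a[j] = a[j], a[k]
                for row in a:
                    row[k], row[j] = row[j], row[k]
            else:
                # Otherwise use a product x_k * x_j to create a square of x_k.
                j = next((j for j in range(k + 1, n) if a[k][j] != 0), None)
                if j is None:
                    continue  # x_k no longer occurs in the remaining form
                add_multiple(k, j, Fraction(1))
        for i in range(k + 1, n):
            if a[i][k] != 0:
                add_multiple(i, k, -a[i][k] / a[k][k])
    return [a[i][i] for i in range(n)]


def rank_and_signature(matrix):
    diagonal = diagonal_by_congruence(matrix)
    positive = sum(1 for d in diagonal if d > 0)
    negative = sum(1 for d in diagonal if d < 0)
    return positive + negative, positive - negative


def is_decomposable_over_reals(matrix):
    rank, signature = rank_and_signature(matrix)
    return rank <= 1 or (rank == 2 and signature == 0)


half = Fraction(1, 2)
# (x1 + 2*x2) * (3*x1 - x3) = 3*x1^2 + 6*x1*x2 - x1*x3 - 2*x2*x3
product_form = [[3, 3, -half], [3, 0, -1], [-half, -1, 0]]
# x1^2 + x2^2 has rank 2 but signature 2.
sum_of_squares = [[1, 0], [0, 1]]

print(rank_and_signature(product_form))
print(is_decomposable_over_reals(product_form))
print(is_decomposable_over_reals(sum_of_squares))
```

The first sample matrix was built from an explicit product of two linear forms, so the theorem predicts rank 2 and signature 0 for it; the second sample is a sum of two squares, which the theorem excludes over the real numbers.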

28. Positive Definite Forms 

A quadratic form $f$ in $n$ unknowns with real coefficients is called
positive definite if it can be reduced to a normal form consisting
of $n$ positive squares, that is, if both the rank and the positive
index of inertia of this form are equal to the number of unknowns.

The following theorem enables us to characterize positive definite
forms without reducing them to normal form or canonical form.

A quadratic form $f$ in $n$ unknowns $x_1, x_2, \ldots, x_n$ with real
coefficients is positive definite if and only if for all real values of the
unknowns, at least one of which is nonzero, the form receives positive
values.

Proof. Let the form $f$ be positive definite, i.e., reducible to the
normal form

$$f = y_1^2 + y_2^2 + \ldots + y_n^2 \qquad (1)$$

and let

$$y_i = \sum_{j=1}^{n} a_{ij} x_j, \quad i = 1, 2, \ldots, n \qquad (2)$$

with a nonzero determinant of the real coefficients $a_{ij}$. If we want
to substitute, into $f$, arbitrary real values of the unknowns
$x_1, x_2, \ldots, x_n$, at least one of which is nonzero, then we can first
substitute them into (2) and then substitute the values obtained
for all $y_i$ into (1). It will be noted that the values obtained from (2)
for $y_1, y_2, \ldots, y_n$ cannot all be zero at once, for then we would
have that the system of homogeneous linear equations

$$\sum_{j=1}^{n} a_{ij} x_j = 0, \quad i = 1, 2, \ldots, n$$

has a nontrivial solution, though its determinant is different from
zero. Substituting the values found for $y_1, y_2, \ldots, y_n$ into (1), we
get the value of the form $f$ equal to the sum of the squares of $n$
real numbers, not all zero. This value will consequently be strictly
positive.

Conversely, suppose the form $f$ is not positive definite, that is,
either its rank or the positive index of inertia is less than $n$. This
means that in the normal form of $f$, to which it is reduced, say, by
the nonsingular linear transformation (2), the square of at least one
of the new unknowns, say $y_n$, is either absent altogether or is
present with a minus sign. We will show that in this case it is possible
to choose real values for the unknowns $x_1, x_2, \ldots, x_n$, not all
zero, such that the value of the form $f$ for these values of the
unknowns is equal to zero or is even negative. Such, for instance, are
the values for $x_1, x_2, \ldots, x_n$ which we obtain when solving, by
Cramer's rule, the system of linear equations obtained from (2) for
$y_1 = y_2 = \ldots = y_{n-1} = 0$, $y_n = 1$. Indeed, for these values of
the unknowns $x_1, x_2, \ldots, x_n$ the form $f$ is zero if $y_n$ does not
enter into the normal form of $f$, and is equal to $-1$ if $y_n$ enters into
the normal form with a minus sign.

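The theorem that has just been proved is used wherever positive
definite quadratic forms are employed. However, it cannot be used
to establish from the coefficients whether a form is positive definite
or not. This is handled by a different theorem which we will state
and prove after introducing an auxiliary notion.

Suppose we have a quadratic form $f$ in $n$ unknowns with the
matrix $A = (a_{ij})$. The minors of order $1, 2, \ldots, n$ of this matrix
situated in the upper left corner, that is, the minors

$$a_{11}, \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad \ldots, \quad \begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1k} \\ a_{21} & a_{22} & \ldots & a_{2k} \\ \ldots & \ldots & \ldots & \ldots \\ a_{k1} & a_{k2} & \ldots & a_{kk} \end{vmatrix}, \quad \ldots, \quad \begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix}$$

of which the last obviously coincides with the determinant of
matrix $A$, are called the principal minors of the form $f$.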

The following theorem holds true. 

A quadratic form $f$ in $n$ unknowns with real coefficients is positive
definite if and only if all its principal minors are strictly positive.

Proof. For $n = 1$, the theorem is true since the form then is $ax^2$
and therefore is positive definite if and only if $a > 0$. For this
reason, we prove the theorem for the case of $n$ unknowns on the
assumption that it has already been proved for quadratic forms in $n - 1$
unknowns.

Note the following. 

If a quadratic form $f$ with real coefficients constituting a
matrix $A$ is subjected to a nonsingular linear transformation with a real
matrix $Q$, then the sign of the determinant of the form (that is, the
determinant of its matrix) remains unchanged.

Indeed, after the transformation we obtain a quadratic form
with the matrix $Q'AQ$; however, due to $|Q'| = |Q|$,

$$|Q'AQ| = |Q'| \cdot |A| \cdot |Q| = |A| \cdot |Q|^2$$

that is, the determinant $|A|$ is multiplied by a positive number.






Now suppose we have the quadratic form

$$f = \sum_{i,j=1}^{n} a_{ij} x_i x_j$$

It can be written as

$$f = \varphi(x_1, x_2, \ldots, x_{n-1}) + 2\sum_{i=1}^{n-1} a_{in} x_i x_n + a_{nn} x_n^2 \qquad (3)$$

where $\varphi$ is a quadratic form in $n - 1$ unknowns composed of those
terms of form $f$ which do not contain the unknown $x_n$. The principal
minors of the form $\varphi$ evidently coincide with all principal minors
of the form $f$ except the last.

Let the form $f$ be positive definite. Then the form $\varphi$ will also be
positive definite: if there existed values of the unknowns
$x_1, x_2, \ldots, x_{n-1}$, not all zero, for which the form $\varphi$ receives a
value that is not strictly positive, then, additionally assuming $x_n = 0$,
we would also obtain, by (3), a value of the form $f$ that is not strictly
positive, although not all the values of the unknowns
$x_1, x_2, \ldots, x_{n-1}, x_n$ are equal to zero. For this reason, by the
induction hypothesis, all the principal minors of the form $\varphi$, that is,
all the principal minors of the form $f$ except the last, are strictly
positive. As for the last principal minor of $f$ (that is, the determinant
of the matrix $A$ itself), its positivity is a consequence of the following
reasoning: because of its positive definiteness, form $f$ is reduced by a
nonsingular linear transformation to a normal form consisting of $n$
positive squares. The determinant of this normal form is strictly
positive, and so, by the remark made above, the determinant of the
form $f$ itself is positive.

Now let all the principal minors of the form $f$ be strictly positive.
From this follows the positivity of all the principal minors of the
form $\varphi$, that is, by the induction hypothesis, the positive definiteness
of this form. Therefore, there is a nonsingular linear transformation
of the unknowns $x_1, x_2, \ldots, x_{n-1}$ that reduces the form $\varphi$
to a sum of $n - 1$ positive squares in the new unknowns
$y_1, y_2, \ldots, y_{n-1}$. By setting $x_n = y_n$, this linear transformation may be
completed to form a (nonsingular) linear transformation of all the
unknowns $x_1, x_2, \ldots, x_n$. By (3), form $f$ is reduced by the indicated
transformation to

$$f = \sum_{i=1}^{n-1} y_i^2 + 2\sum_{i=1}^{n-1} b_{in} y_i y_n + b_{nn} y_n^2 \qquad (4)$$

The exact expressions of the coefficients $b_{in}$ are not essential to us.
Since

$$y_i^2 + 2 b_{in} y_i y_n = (y_i + b_{in} y_n)^2 - b_{in}^2 y_n^2$$

it follows that the nonsingular linear transformation

$$z_i = y_i + b_{in} y_n, \quad i = 1, 2, \ldots, n - 1,$$

$$z_n = y_n$$

reduces the form $f$ by (4) to the canonical form

$$f = \sum_{i=1}^{n-1} z_i^2 + c z_n^2 \qquad (5)$$

To prove the positive definiteness of the form $f$, it remains to
prove that the number $c$ is positive. The determinant of the form
in the right member of (5) is equal to $c$. However, this determinant
should be positive since the right side of (5) is obtained from $f$ by
two nonsingular linear transformations, and the determinant of
the form $f$ was positive (being the last of the principal minors of
this form).

This completes the proof of the theorem. 

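Example 1. The quadratic form

$$f = 5x_1^2 + x_2^2 + 5x_3^2 + 4x_1x_2 - 8x_1x_3 - 4x_2x_3$$

is positive definite since its principal minors

$$5, \quad \begin{vmatrix} 5 & 2 \\ 2 & 1 \end{vmatrix} = 1, \quad \begin{vmatrix} 5 & 2 & -4 \\ 2 & 1 & -2 \\ -4 & -2 & 5 \end{vmatrix} = 1$$

are positive.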

Example 2. The quadratic form

$$f = 3x_1^2 + x_2^2 + 5x_3^2 + 4x_1x_2 - 8x_1x_3 - 4x_2x_3$$

is not positive definite since its second principal minor is negative:

$$\begin{vmatrix} 3 & 2 \\ 2 & 1 \end{vmatrix} = -1$$
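The principal-minor test is easy to mechanize for small forms. The Python sketch below is an illustration under stated assumptions, not part of the original text: the function names are invented here, the form is given by its symmetric matrix, and determinants are expanded along the first row, which is adequate only for small orders. The second sample matrix is an illustrative variation on Example 2 with the first coefficient changed to 5.

```python
from fractions import Fraction


def determinant(m):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return Fraction(m[0][0])
    total = Fraction(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * Fraction(entry) * determinant(minor)
    return total


def principal_minors(a):
    """Upper-left corner minors of orders 1, 2, ..., n."""
    return [determinant([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]


def is_positive_definite(a):
    return all(minor > 0 for minor in principal_minors(a))


def value(a, x):
    """Value of the quadratic form with symmetric matrix a at the point x."""
    n = len(a)
    return sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Matrix of 3*x1^2 + x2^2 + 5*x3^2 + 4*x1*x2 - 8*x1*x3 - 4*x2*x3 (Example 2).
example_2 = [[3, 2, -4], [2, 1, -2], [-4, -2, 5]]
# The same matrix with the first coefficient changed to 5 (illustrative choice).
modified = [[5, 2, -4], [2, 1, -2], [-4, -2, 5]]

print(principal_minors(example_2))    # the second minor is 3*1 - 2*2 = -1
print(is_positive_definite(example_2))
print(value(example_2, [1, -2, 0]))   # a nonzero point where the form is negative
print(principal_minors(modified))
print(is_positive_definite(modified))
```

The call to `value` ties the two theorems of this section together: a form that fails the minor test must take a value that is not strictly positive at some nonzero point, and here the point $(1, -2, 0)$ gives $3 + 4 - 8 = -1$.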

Note that by analogy with positive definite quadratic forms we
can introduce negative definite forms, that is, nonsingular quadratic
forms with real coefficients whose normal form contains only negative
squares of the unknowns. Singular quadratic forms whose
normal form consists of the squares of one sign are sometimes termed
semidefinite. Finally, indefinite quadratic forms are those whose
normal form contains both positive and negative squares of the
unknowns.

CHAPTER 7


LINEAR SPACES 


29. Definition of a Linear Space. An Isomorphism 

The definition of an $n$-dimensional vector space given in Sec. 8
began with a definition of an $n$-dimensional vector as an ordered
set of $n$ numbers ($n$-tuple). For $n$-dimensional vectors we then
introduced addition and multiplication by scalars, which is what led
to the concept of an $n$-dimensional vector space. The first instances of
vector spaces are collections of vector segments emanating from a
coordinate origin in the plane or in three-dimensional space. However,
when we encounter such cases in geometry, we do not always
find it necessary to specify the vectors via their components in some
fixed system of coordinates, since both addition of vectors and their
multiplication by a scalar are determined geometrically, irrespective
of the choice of any coordinate system. Namely, the addition
of vectors in the plane or in space is accomplished by the
parallelogram rule, while the multiplication of a vector by a scalar $\alpha$
signifies a stretching of the vector by the factor $\alpha$ (the direction is
reversed if $\alpha$ is negative). It is advisable to give a "coordinateless"
definition of a vector space in the general case as well. By this is meant
a definition which does not require specifying vectors by ordered
sets of numbers. We now give such a definition. This definition is
axiomatic; nothing will be said about the properties of a separate
vector, but we will enumerate the properties of operations
involving vectors.

Suppose we have a set $V$. We denote its elements by lower-case
Latin letters: $a, b, c, \ldots$* Now, in set $V$ we define the operation
of addition, which associates every pair of elements $a$, $b$ in $V$ with
a uniquely defined element $a + b$ in $V$, called the sum, and the
operation of multiplication by a real number (scalar); the product $\alpha a$
of element $a$ by a scalar $\alpha$ is uniquely defined and belongs to $V$.

The elements of $V$ will be termed vectors, and $V$ itself will be
called a real linear (or vector, or affine) space if the indicated
operations have the following properties (I to VIII).

* In contrast to Chapter 2, here and in the sequel, vectors will be
designated by lower-case Latin letters, scalars by lower-case Greek letters.

I. Addition is commutative: $a + b = b + a$.

II. Addition is associative: $(a + b) + c = a + (b + c)$.

III. There is a zero element 0 in $V$ which satisfies the condition:
$a + 0 = a$ for all $a$ in $V$.

Using I it is easy to prove the uniqueness of the zero element: if
$0_1$ and $0_2$ are two zero elements, then

$$0_1 + 0_2 = 0_1,$$

$$0_1 + 0_2 = 0_2 + 0_1 = 0_2$$

whence $0_1 = 0_2$.

IV. For any element $a$ in $V$ there exists an opposite (inverse)
element $-a$, which satisfies the condition: $a + (-a) = 0$.

Using II and I, it is easy to prove the uniqueness of the inverse
element: if $(-a)_1$ and $(-a)_2$ are two inverse elements of $a$, then

$$(-a)_1 + [a + (-a)_2] = (-a)_1 + 0 = (-a)_1,$$

$$[(-a)_1 + a] + (-a)_2 = 0 + (-a)_2 = (-a)_2$$

whence $(-a)_1 = (-a)_2$.

From axioms I to IV we deduce the existence and uniqueness of the
difference $a - b$, that is, an element which satisfies the equation

$$b + x = a \qquad (1)$$

We can set

$$a - b = a + (-b)$$

since

$$b + [a + (-b)] = [b + (-b)] + a = 0 + a = a$$

Now if there is an element $c$ that satisfies (1),

$$b + c = a$$

then, by adding to both sides the element $-b$, we get

$$c = a + (-b)$$

Axioms V to VIII (cf. Sec. 8) relate multiplication by a scalar
to addition and to operations involving scalars. Namely, for any
elements $a$, $b$ in $V$, for any real numbers $\alpha$, $\beta$, and for the real number 1,
the following equalities must hold:

V. $\alpha(a + b) = \alpha a + \alpha b$,

VI. $(\alpha + \beta)a = \alpha a + \beta a$,

VII. $(\alpha\beta)a = \alpha(\beta a)$,

VIII. $1 \cdot a = a$.


Elementary corollaries to these axioms are as follows.

[1] $\alpha \cdot 0 = 0$.

For some $a$ in $V$,

$$\alpha a = \alpha(a + 0) = \alpha a + \alpha \cdot 0$$

that is,

$$\alpha \cdot 0 = \alpha a - \alpha a = \alpha a + [-(\alpha a)] = 0$$

[2] $0 \cdot a = 0$,

where the zero on the left is the number zero and the zero on the
right is the zero element of $V$.

To prove this, take any scalar $\alpha$. Then

$$\alpha a = (\alpha + 0)a = \alpha a + 0 \cdot a$$

whence

$$0 \cdot a = \alpha a - \alpha a = 0$$

[3] If $\alpha a = 0$, then either $\alpha = 0$ or $a = 0$.

If $\alpha \neq 0$, that is, the scalar $\alpha^{-1}$ exists, then

$$a = 1 \cdot a = (\alpha^{-1}\alpha)a = \alpha^{-1}(\alpha a) = \alpha^{-1} \cdot 0 = 0$$

[4] $\alpha(-a) = -\alpha a$.

Indeed,

$$\alpha a + \alpha(-a) = \alpha[a + (-a)] = \alpha \cdot 0 = 0$$

that is, the element $\alpha(-a)$ is the inverse of $\alpha a$.

[5] $(-\alpha)a = -\alpha a$.

Indeed,

$$\alpha a + (-\alpha)a = [\alpha + (-\alpha)]a = 0 \cdot a = 0$$

that is, the element $(-\alpha)a$ is the inverse of $\alpha a$.

[6] $\alpha(a - b) = \alpha a - \alpha b$.

By [4],

$$\alpha(a - b) = \alpha[a + (-b)] = \alpha a + \alpha(-b) = \alpha a + (-\alpha b) = \alpha a - \alpha b$$

[7] $(\alpha - \beta)a = \alpha a - \beta a$.

Indeed,

$$(\alpha - \beta)a = [\alpha + (-\beta)]a = \alpha a + (-\beta)a = \alpha a + (-\beta a) = \alpha a - \beta a$$

These axioms and their corollaries will be used from now on
without any special reservations.

The definition given above is for a real linear space. If we assumed,
in $V$, multiplication not only by real numbers but also by arbitrary
complex numbers, then, retaining Axioms I to VIII, we would
have the definition of a complex linear space. For the sake of
definiteness, we will consider real linear spaces; however, everything
in this chapter can be extended word for word to the case of
complex linear spaces.

Examples of real linear spaces come to mind immediately. They
include the $n$-dimensional real vector spaces composed of row vectors
that were studied in Chapter 2, also sets of vector segments
emanating from a coordinate origin in the plane or in three-dimensional
space if the operations of addition and multiplication by
a scalar are understood in the geometric sense stated at the beginning
of this section.

We also have linear spaces that are infinite-dimensional. Let
us consider all possible sequences of real numbers; they have
the form

$$a = (\alpha_1, \alpha_2, \ldots, \alpha_n, \ldots)$$

We perform operations on sequences componentwise: if

$$b = (\beta_1, \beta_2, \ldots, \beta_n, \ldots)$$

then

$$a + b = (\alpha_1 + \beta_1, \alpha_2 + \beta_2, \ldots, \alpha_n + \beta_n, \ldots)$$

On the other hand, for any real number $\gamma$,

$$\gamma a = (\gamma\alpha_1, \gamma\alpha_2, \ldots, \gamma\alpha_n, \ldots)$$

All the axioms from I to VIII are fulfilled, which means we have
a real linear space.

Another instance of an infinite-dimensional space is the set of
all possible real functions of a real variable if the addition of
functions and their multiplication by a real number are understood
as is conventional in the theory of functions, that is, as the addition,
or multiplication by the number, of the values of the functions for each
value of the independent variable.
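The pointwise operations on functions can be written down directly. The short Python sketch below is an illustration only: the helper names `add`, `scale`, and `zero` are invented here, and the checks are made at individual sample points in floating-point arithmetic, so they illustrate axioms I and III rather than prove them.

```python
import math


def add(f, g):
    """Sum of two real functions: the values are added at each point."""
    return lambda x: f(x) + g(x)


def scale(gamma, f):
    """Product of a function by the real number gamma."""
    return lambda x: gamma * f(x)


def zero(x):
    """The zero element of the space: the function that is 0 everywhere."""
    return 0.0


def square(x):
    return x * x


h = add(scale(2.0, square), math.cos)   # h(x) = 2*x^2 + cos x
print(h(0.0))                           # 2*0 + cos 0 = 1.0
print(add(h, zero)(3.0) == h(3.0))      # axiom III at the sample point 3.0
print(add(square, math.cos)(1.5) == add(math.cos, square)(1.5))  # axiom I
```

Nothing in these definitions refers to coordinates: a "vector" here is a whole function, and only the two operations are specified, which is exactly the point of the axiomatic definition.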

Isomorphisms. Our immediate aim is to select from all linear 
spaces those which it will be natural to call finite-dimensional. 

First let us introduce a general concept. 

In the definition of a linear space we spoke about the properties
of operations involving vectors, but we said nothing about the
properties of the vectors themselves. Thus, it may happen that although
the vectors of two given linear spaces are quite different as to their
nature, the two spaces are indistinguishable from the standpoint of
the properties of the operations. The exact definition is as follows.

Two real linear spaces $V$ and $V'$ are called isomorphic if a
one-to-one correspondence can be set up between their vectors: every
vector $a$ of $V$ is associated with a vector $a'$ of $V'$, the image of the
vector $a$, different vectors from $V$ possess different images and every
vector in $V'$ serves as an image of some vector in $V$; and if in this
correspondence the image of a sum of two vectors is the sum of the
images of the two vectors,

$$(a + b)' = a' + b' \qquad (2)$$

and the image of a product of a vector by a scalar is the product
of the image of the vector by that scalar,

$$(\alpha a)' = \alpha a' \qquad (3)$$

The one-to-one correspondence between spaces $V$ and $V'$ which
satisfies the conditions (2) and (3) is called an isomorphic
correspondence.

Thus, the space of vector segments (in a plane) emanating from
a coordinate origin is isomorphic to a two-dimensional vector space
made up of ordered pairs of real numbers: we obtain an isomorphic
correspondence between these spaces if in the plane we fix some
system of coordinates and associate with every vector segment an
ordered pair of its coordinates.

Let us prove the following property of an isomorphism of linear
spaces: the image of zero of the space $V$ is the zero of the space $V'$ in
an isomorphic correspondence between $V$ and $V'$.

Let $a$ be some vector in $V$ and $a'$ its image in $V'$. Then, by (2),

$$a' = (a + 0)' = a' + 0'$$

That is to say, $0'$ is a zero of the space $V'$.


30. Finite-Dimensional Spaces. Bases 

As the reader can verify without difficulty, the two definitions
of linear dependence of row vectors given in Sec. 9, and also the
proof of the equivalence of these definitions, employ only operations
on vectors and for this reason can be carried over to the case of any
linear spaces. Consequently, in axiomatically defined linear spaces
we can speak of linearly independent systems of vectors, of
maximal linearly independent systems, if such exist, and so on.

If the linear spaces $V$ and $V'$ are isomorphic, then the system of
vectors $a_1, a_2, \ldots, a_k$ in $V$ is linearly dependent if and only if the
system of their images $a_1', a_2', \ldots, a_k'$ in $V'$ is linearly dependent.

Note that if the correspondence $a \to a'$ (for all $a$ in $V$) is an
isomorphic correspondence between $V$ and $V'$, then the reverse
correspondence $a' \to a$ will also be isomorphic. It is therefore
sufficient to consider the case when the system $a_1, a_2, \ldots, a_k$ is linearly
dependent. Let there be scalars $\alpha_1, \alpha_2, \ldots, \alpha_k$, not all zero, such
that

$$\alpha_1 a_1 + \alpha_2 a_2 + \ldots + \alpha_k a_k = 0$$

In the isomorphism under consideration, the image of the right
member of this equation is, as we know, the zero $0'$ of space $V'$.
Taking the image of the left member and applying (2) and (3) several
times, we get

$$\alpha_1 a_1' + \alpha_2 a_2' + \ldots + \alpha_k a_k' = 0'$$

Thus, the system $a_1', a_2', \ldots, a_k'$ is also linearly dependent.

Finite-dimensional spaces. A linear space $V$ is called
finite-dimensional if in it we can find a finite maximal linearly independent
system of vectors; any such system of vectors will be termed a
basis of the space.

A finite-dimensional linear space can have many different bases.
Thus, in the space of vector segments in the plane any pair of
vectors different from zero and not lying on one straight line can serve
as a basis. Note that so far our definition of a finite-dimensional
space leaves open the question of whether there can exist, in this space,
bases consisting of different numbers of vectors. What is more, it might
even be assumed that in some finite-dimensional spaces there exist
bases with an arbitrarily large number of vectors. Let us investigate
this situation.

Suppose a linear space $V$ has a basis

$$e_1, e_2, \ldots, e_n \qquad (1)$$

consisting of $n$ vectors. If $a$ is an arbitrary vector in $V$, then from the
maximality of the linearly independent system (1) it follows that
$a$ is expressed linearly in terms of the system:

$$a = \alpha_1 e_1 + \alpha_2 e_2 + \ldots + \alpha_n e_n \qquad (2)$$

On the other hand, due to the linear independence of (1),
expression (2) will be unique for the vector $a$: if

$$a = \alpha_1' e_1 + \alpha_2' e_2 + \ldots + \alpha_n' e_n$$

then

$$(\alpha_1 - \alpha_1') e_1 + (\alpha_2 - \alpha_2') e_2 + \ldots + (\alpha_n - \alpha_n') e_n = 0$$

whence

$$\alpha_i = \alpha_i', \quad i = 1, 2, \ldots, n$$

Thus, the vector $a$ is associated one-to-one with the row

$$(\alpha_1, \alpha_2, \ldots, \alpha_n) \qquad (3)$$

of coefficients of its expression (2) in terms of the basis (1) or, as
we shall say, the row of its coordinates in the basis (1). Conversely,
every row of type (3), that is, any $n$-dimensional vector in the sense
of Chapter 2, serves as a row of coordinates in basis (1) for some vector
of space $V$, namely, for the vector written in the form (2) in terms
of the basis (1).

We have thus obtained a one-to-one correspondence between all 
vectors of the space V and all vectors of an n-dimensional vector 
row-space. We will show that this correspondence, which quite natu- 
rally is dependent on the choice of the basis (1), is isomorphic. 

In space $V$ let us, in addition to vector $a$, which is expressed in
terms of the basis (1) in the form (2), also take a vector $b$ whose
expression in terms of the basis (1) is

$$b = \beta_1 e_1 + \beta_2 e_2 + \ldots + \beta_n e_n$$

Then

$$a + b = (\alpha_1 + \beta_1) e_1 + (\alpha_2 + \beta_2) e_2 + \ldots + (\alpha_n + \beta_n) e_n$$

that is, the sum of the vectors $a$ and $b$ corresponds to the sum of the rows
of their coordinates in the basis (1). On the other hand,

$$\gamma a = (\gamma\alpha_1) e_1 + (\gamma\alpha_2) e_2 + \ldots + (\gamma\alpha_n) e_n$$

that is, to the product of a vector $a$ by a scalar $\gamma$ corresponds the product
of the row of its coordinates in the basis (1) by the same scalar $\gamma$.

The foregoing proves the following theorem. 

Any linear space with a basis consisting of n vectors is isomorphic 
to an n-dimensional vector row-space. 

As we know, in an isomorphic correspondence between linear
spaces, a linearly dependent system of vectors goes into a linearly
dependent system and conversely; for this reason, a linearly
independent system goes into a linearly independent system. From this
it follows that in an isomorphic correspondence, a basis goes into a basis.

Indeed, let a basis $e_1, e_2, \ldots, e_n$ of a space $V$ go (under an
isomorphic correspondence between the spaces $V$ and $V'$) into a system
of vectors $e_1', e_2', \ldots, e_n'$ of space $V'$, which, though it is linearly
independent, is not maximal. Consequently, in $V'$ we can find a
vector $f'$ such that the system $e_1', e_2', \ldots, e_n', f'$ remains linearly
independent. However, the vector $f'$ in this isomorphism serves as
an image of some vector $f$ in $V$. We find that the system of vectors
$e_1, e_2, \ldots, e_n, f$ must be linearly independent, which is in
contradiction to the definition of a basis.

Further, we know (see Sec. 9) that in an $n$-dimensional vector
row-space, all maximal linearly independent systems consist of $n$
vectors, that any system of $n + 1$ vectors is linearly dependent, and
that any linearly independent system of vectors is contained in some
maximal linearly independent system. Using the above-established
properties of isomorphic correspondences, we arrive at the following
results.

All bases of a finite-dimensional linear space $V$ consist of one and
the same number of vectors. If this number is equal to $n$, then $V$ is
called an $n$-dimensional linear space, and the number $n$ is the
dimension of this space.

Any system of $n + 1$ vectors of an $n$-dimensional linear space is
linearly dependent.

Any linearly independent system of vectors of an $n$-dimensional
linear space is contained in some basis of that space.

It is now easy to verify that the above-indicated examples of
real linear spaces (the space of sequences and the space of functions)
are not finite-dimensional spaces: in each of these spaces the
reader will easily find linearly independent systems consisting
of an arbitrarily large number of vectors.

Relationships between bases. We are interested in finite-dimensional
linear spaces. Clearly, when studying $n$-dimensional linear
spaces we are actually studying the $n$-dimensional vector row-space
that was introduced back in Chapter 2. Earlier, however, we singled
out one basis from this space, namely, the basis composed of unit
vectors (these are vectors, one coordinate of which is equal to unity
and all others are zero), and all the vectors of the space were specified
by the rows of their coordinates in that basis. Now, however, all
bases of a space have equal status.

Let us see how many bases can be found in an $n$-dimensional
linear space and how these bases are interrelated.

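Suppose in an $n$-dimensional linear space $V$ we have the bases

$$e_1, e_2, \ldots, e_n \qquad (4)$$

and

$$e_1', e_2', \ldots, e_n' \qquad (5)$$

Each vector of basis (5), like any vector of the space $V$, is
unambiguously written in terms of basis (4) as

$$e_i' = \sum_{j=1}^{n} \tau_{ij} e_j, \quad i = 1, 2, \ldots, n \qquad (6)$$

The matrix

$$T = \begin{pmatrix} \tau_{11} & \ldots & \tau_{1n} \\ \ldots & \ldots & \ldots \\ \tau_{n1} & \ldots & \tau_{nn} \end{pmatrix}$$

whose rows are the rows of the coordinates of the vectors (5) in basis
(4) is called the change-of-basis matrix from basis (4) to basis (5).

Because of (6), we can write the relationship between bases (4)
and (5) and the change-of-basis matrix $T$ in the form of a matrix
equation:

$$\begin{pmatrix} e_1' \\ e_2' \\ \vdots \\ e_n' \end{pmatrix} = \begin{pmatrix} \tau_{11} & \tau_{12} & \ldots & \tau_{1n} \\ \tau_{21} & \tau_{22} & \ldots & \tau_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ \tau_{n1} & \tau_{n2} & \ldots & \tau_{nn} \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix} \qquad (7)$$

or, denoting by $e$ and $e'$, respectively, the bases (4) and (5) as columns:

$$e' = Te$$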

On the other hand, if $T'$ is the change-of-basis matrix from (5)
to (4), then

$$e = T'e'$$

whence

$$e = (T'T)e, \qquad e' = (TT')e'$$

or, because of the linear independence of the bases $e$ and $e'$,

$$T'T = TT' = E$$

whence

$$T' = T^{-1}$$

This proves that the change-of-basis matrix is always a nonsingular
matrix.

Any nonsingular square matrix of order $n$ with real elements can
serve as a matrix for changing from a given basis of an $n$-dimensional
real linear space to some other basis.

Suppose we have a given basis (4) and a nonsingular matrix $T$
of order $n$. For (5) take a system of vectors for which the rows of
matrix $T$ serve as the rows of coordinates in basis (4); thus, we have
equation (7). The vectors (5) are linearly independent (linear
dependence would have implied a linear dependence of the rows of
matrix $T$, in conflict with its nonsingularity). Therefore, system (5),
as a linearly independent system consisting of $n$ vectors, is a basis
of our space, and the matrix $T$ serves as a change-of-basis matrix
from basis (4) to basis (5).

We see that in an $n$-dimensional linear space we can find as many
distinct bases as there are distinct nonsingular square matrices of
order $n$. True, here, two bases consisting of the same vectors but
written in a different order are considered distinct.

Transformation of vector coordinates. Suppose in an $n$-dimensional
linear space we have the bases (4) and (5) given with the
change-of-basis matrix $T = (\tau_{ij})$,

$$e' = Te$$

Let us find the connection between the coordinate rows of an
arbitrary vector $a$ in these bases.

Let

$$a = \sum_{j=1}^{n} \alpha_j e_j, \qquad a = \sum_{i=1}^{n} \alpha_i' e_i' \qquad (8)$$

Using (6) we find

$$a = \sum_{i=1}^{n} \alpha_i' \Big( \sum_{j=1}^{n} \tau_{ij} e_j \Big) = \sum_{j=1}^{n} \Big( \sum_{i=1}^{n} \alpha_i' \tau_{ij} \Big) e_j$$

Comparing with (8) and using the uniqueness of vector notation in
terms of a basis, we obtain

$$\alpha_j = \sum_{i=1}^{n} \alpha_i' \tau_{ij}, \quad j = 1, 2, \ldots, n$$

Thus we have the matrix equation

$$(\alpha_1, \alpha_2, \ldots, \alpha_n) = (\alpha_1', \alpha_2', \ldots, \alpha_n') T$$

Thus the row of coordinates of the vector $a$ in the basis $e$ is equal
to the row of coordinates of this vector in the basis $e'$ multiplied on the
right by the change-of-basis matrix from the basis $e$ to the basis $e'$.
Whence clearly follows the equation

$$(\alpha_1', \alpha_2', \ldots, \alpha_n') = (\alpha_1, \alpha_2, \ldots, \alpha_n) T^{-1}$$

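Example. Consider a three-dimensional real linear space with the basis

$$e_1, e_2, e_3 \qquad (9)$$

The vectors

$$e_1' = 5e_1 - e_2 - 2e_3,$$

$$e_2' = 2e_1 + 3e_2,$$

$$e_3' = -2e_1 + e_2 + e_3 \qquad (10)$$

also form a basis in this space, the matrix

$$T = \begin{pmatrix} 5 & -1 & -2 \\ 2 & 3 & 0 \\ -2 & 1 & 1 \end{pmatrix}$$

serving as the change-of-basis matrix from (9) to (10). We then have

$$T^{-1} = \begin{pmatrix} 3 & -1 & 6 \\ -2 & 1 & -4 \\ 8 & -3 & 17 \end{pmatrix}$$

The vector

$$a = e_1 + 4e_2 - e_3$$

therefore has, in basis (10), the row of coordinates

$$(\alpha_1', \alpha_2', \alpha_3') = (1, 4, -1) \begin{pmatrix} 3 & -1 & 6 \\ -2 & 1 & -4 \\ 8 & -3 & 17 \end{pmatrix} = (-13, 6, -27)$$

or

$$a = -13e_1' + 6e_2' - 27e_3'$$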
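The arithmetic of this example can be checked mechanically. The Python sketch below is an illustration, not part of the original text: the helper names `inverse` and `row_times_matrix` are invented here, and the inverse is computed by Gauss-Jordan elimination over the rationals, which is one convenient method among several.

```python
from fractions import Fraction


def inverse(m):
    """Inverse of a square matrix by Gauss-Jordan elimination over the rationals."""
    n = len(m)
    # Augment each row of m with the matching row of the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular, so it is not a change-of-basis matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def row_times_matrix(row, m):
    """Product of a coordinate row by a matrix on the right."""
    return [sum(row[i] * m[i][j] for i in range(len(row))) for j in range(len(m[0]))]


# Rows of T: coordinates of the new basis vectors in the old basis (9).
T = [[5, -1, -2], [2, 3, 0], [-2, 1, 1]]
T_inv = inverse(T)
old_coordinates = [1, 4, -1]   # a = e1 + 4*e2 - e3
new_coordinates = row_times_matrix(old_coordinates, T_inv)

print([[int(x) for x in row] for row in T_inv])
print([int(x) for x in new_coordinates])
# Going back: multiplying the new row by T recovers the old coordinates.
print(row_times_matrix(new_coordinates, T) == old_coordinates)
```

The last line checks the relation between the two coordinate rows in the opposite direction, so the computed inverse is tested against $T$ itself rather than only against the printed matrix.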
31. Linear Transformations

In Chapter 3 we dealt with the concept of a linear transforma- 
tion of unknowns. The concept we now introduce bears the same 
name but is different in character. True, certain relationships could 
be established between these two notions. 

Let there be given an $n$-dimensional real linear space, which
we denote by $V_n$. We consider a transformation of this space, that
is, a mapping which takes every vector $a$ of $V_n$ into some vector $a'$
of the same space. The vector $a'$ is called the image of $a$ under the
given transformation.

If we use $\varphi$ to denote the transformation, then the image of
vector $a$ will be written as $a\varphi$ instead of the more customary $\varphi(a)$ or
$\varphi a$. Thus,

$$a' = a\varphi$$

A transformation $\varphi$ of a linear space is called a linear
transformation of this space if it takes the sum of any two vectors $a$, $b$ into
the sum of the images of these vectors:

$$(a + b)\varphi = a\varphi + b\varphi \qquad (1)$$

and the product of any vector $a$ by any scalar $\alpha$ into the product
of the image of the vector $a$ by that same scalar $\alpha$:

$$(\alpha a)\varphi = \alpha(a\varphi) \qquad (2)$$

From this definition, it immediately follows that a linear
transformation of a linear space carries any linear combination of given
vectors $a_1, a_2, \ldots, a_k$ into a linear combination (with the same
coefficients) of the images of the vectors:

$$(\alpha_1 a_1 + \alpha_2 a_2 + \ldots + \alpha_k a_k)\varphi = \alpha_1(a_1\varphi) + \alpha_2(a_2\varphi) + \ldots + \alpha_k(a_k\varphi) \qquad (3)$$

Let us prove the following assertion.

Under any linear transformation $\varphi$ of a linear space the zero
vector 0 remains fixed,

$$0\varphi = 0$$

and the image of the inverse of the given vector $a$ is a vector that is inverse
to the image of $a$:

$$(-a)\varphi = -a\varphi$$

Indeed, if $b$ is an arbitrary vector, then, by (2),

$$0\varphi = (0 \cdot b)\varphi = 0 \cdot (b\varphi) = 0$$

On the other hand,

$$(-a)\varphi = [(-1)a]\varphi = (-1)(a\varphi) = -a\varphi$$

The concept of a linear transformation of a linear space arose
as a generalization of the familiar analytic geometry concept of
the affine transformation of a plane or of three-dimensional space.
Indeed, conditions (1) and (2) are fulfilled under affine
transformations. These conditions are also fulfilled for projections of vectors
on a plane or, in three-dimensional space, on a straight line (or a
plane). Thus, for example, in a two-dimensional linear space of
vector segments emanating from the origin of the plane, the
transformation carrying a vector into its projection on some axis passing
through the origin is a linear transformation.

Examples of linear transformations in an arbitrary space $V_n$
are the identity transformation $\varepsilon$, which leaves every vector
$a$ fixed,

$$a\varepsilon = a$$

and the zero transformation $\omega$, which maps every vector $a$ into
zero,

$$a\omega = 0$$

We will now obtain a survey of all linear transformations of
a linear space $V_n$. Let

$$e_1, e_2, \ldots, e_n \tag{4}$$

be a basis of this space. As we have already done, denote by $e$ the
basis (4) arranged in a column. Since any vector $a$ of the space $V_n$
is uniquely represented as a linear combination of vectors of the
basis (4), it follows, by (3), that the image of vector $a$ can be
expressed, with the same coefficients, in terms of the images of the
vectors (4). In other words, any linear transformation $\varphi$ of
$V_n$ is uniquely determined by specifying the images
$e_1\varphi, e_2\varphi, \ldots, e_n\varphi$ of all vectors of the fixed
basis (4).

No matter what the ordered system of $n$ vectors of $V_n$,

$$c_1, c_2, \ldots, c_n \tag{5}$$

there is a unique linear transformation $\varphi$ of this space such that
(5) serves as the system of images of the vectors of basis (4) under
this transformation,

$$e_i\varphi = c_i, \quad i = 1, 2, \ldots, n \tag{6}$$


The uniqueness of the transformation $\varphi$ has already been proved;
it remains to prove its existence. Let us define the transformation
$\varphi$ as follows: if $a$ is an arbitrary vector of the space and

$$a = \sum_{i=1}^{n} \alpha_i e_i$$

is its notation in the basis (4), then put

$$a\varphi = \sum_{i=1}^{n} \alpha_i c_i \tag{7}$$





CH. 7. LINEAR SPACES 


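Let us prove the linearity of this transformation. If

$$b = \sum_{i=1}^{n} \beta_i e_i$$

is any other vector of the space, then

$$(a + b)\varphi = \Big[\sum_{i=1}^{n} (\alpha_i + \beta_i)e_i\Big]\varphi = \sum_{i=1}^{n} (\alpha_i + \beta_i)c_i = \sum_{i=1}^{n} \alpha_i c_i + \sum_{i=1}^{n} \beta_i c_i = a\varphi + b\varphi$$

But if $\gamma$ is any scalar, then

$$(\gamma a)\varphi = \Big[\sum_{i=1}^{n} (\gamma\alpha_i)e_i\Big]\varphi = \sum_{i=1}^{n} (\gamma\alpha_i)c_i = \gamma\sum_{i=1}^{n} \alpha_i c_i = \gamma(a\varphi)$$

The correctness of (6) follows from the definition (7) of the
transformation $\varphi$, since all coordinates of the vector $e_i$ in the
basis (4) are zero (except the $i$th coordinate, which is equal to unity).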

We have thus established a one-to-one correspondence between all 
linear transformations of the linear space Vn and all ordered systems 
(5) made up of n vectors of this space. 

However, every vector $c_i$ has a definite notation in the basis (4):

$$c_i = \sum_{j=1}^{n} \alpha_{ij} e_j, \quad i = 1, 2, \ldots, n \tag{8}$$

We can form a square matrix of the coordinates of the vectors $c_i$
in the basis (4),

$$A = (\alpha_{ij}) \tag{9}$$

taking for its $i$th row the row of coordinates of the vector $c_i$,
$i = 1, 2, \ldots, n$. Since system (5) was arbitrary, the matrix $A$ will
be an arbitrary square matrix of order $n$ with real elements.

We thus have a one-to-one correspondence between all linear
transformations of the space $V_n$ and all square matrices of order $n$;
this correspondence is of course dependent on the choice of basis (4).

We shall say that the matrix $A$ specifies a linear transformation
$\varphi$ in the basis (4) or, more succinctly, that $A$ is the matrix of
the linear transformation $\varphi$ in the basis (4). If by $e\varphi$ we
denote a column composed of the images of the vectors of (4), then from
(6), (8) and (9) there follows a matrix equation which completely
describes the relationships existing between the linear transformation
$\varphi$, the basis $e$ and the matrix $A$ specifying the linear
transformation in that basis:

$$e\varphi = Ae \tag{10}$$

Let us show how, knowing the matrix $A$ of a linear transformation
$\varphi$ in basis (4), it is possible, via the coordinates of the vector
$a$ in this basis, to find the coordinates of its image $a\varphi$. If

$$a = \sum_{i=1}^{n} \alpha_i e_i$$

then

$$a\varphi = \sum_{i=1}^{n} \alpha_i (e_i\varphi)$$

which is equivalent to the matrix equation

$$a\varphi = (\alpha_1, \alpha_2, \ldots, \alpha_n)(e\varphi)$$

Utilizing (10) and taking into account that the associativity of
matrix multiplication is easy to verify when one of the matrices
is a column made up of vectors, we obtain

$$a\varphi = [(\alpha_1, \alpha_2, \ldots, \alpha_n)A]e$$

Whence it follows that the row of coordinates of a vector $a\varphi$ is
equal to the row of the coordinates of the vector $a$ multiplied on the
right by the matrix $A$ of the linear transformation $\varphi$, all in the
basis (4).


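Example. Let there be a linear transformation $\varphi$ given by the
following matrix in a basis $e_1, e_2, e_3$ of three-dimensional linear
space:

$$A = \begin{pmatrix} -2 & 1 & 0 \\ 1 & 3 & 2 \\ 0 & -4 & 1 \end{pmatrix}$$

If

$$a = 5e_1 + e_2 - 2e_3$$

then

$$(5, 1, -2) \begin{pmatrix} -2 & 1 & 0 \\ 1 & 3 & 2 \\ 0 & -4 & 1 \end{pmatrix} = (-9, 16, 0)$$

so that

$$a\varphi = -9e_1 + 16e_2$$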
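The rule "coordinate row times matrix" can be tried out directly. The following is a minimal Python sketch, not part of the original text: the helper name `vec_mat` is our own, and the first row of the matrix is taken to be (-2, 1, 0), the value consistent with the printed result (-9, 16, 0).

```python
def vec_mat(row, matrix):
    """Multiply a coordinate row by a matrix on the right: row * matrix."""
    n_cols = len(matrix[0])
    return [sum(row[i] * matrix[i][j] for i in range(len(row)))
            for j in range(n_cols)]

# Matrix of the transformation in the basis e1, e2, e3:
# the i-th row holds the coordinates of the image of e_i.
A = [[-2, 1, 0],
     [1, 3, 2],
     [0, -4, 1]]

a = [5, 1, -2]            # a = 5*e1 + e2 - 2*e3
image = vec_mat(a, A)     # coordinates of the image of a

# By (3), the image is the same linear combination of the rows of A,
# since the rows are the images of the basis vectors.
by_linearity = [5 * A[0][j] + 1 * A[1][j] - 2 * A[2][j] for j in range(3)]
```

Feeding in the coordinate row of a basis vector, for instance `[1, 0, 0]`, returns the corresponding row of the matrix, which is exactly the statement of equation (10).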

Relationships between matrices of a linear transformation in
different bases. Quite naturally, a matrix specifying a linear
transformation is dependent on the choice of the basis. We will show
what the relationship is between matrices that specify one and the
same linear transformation in different bases.

Let there be given the bases $e$ and $e'$ with change-of-basis
matrix $T$,

$$e' = Te \tag{11}$$

and let the linear transformation $\varphi$ be given in these bases by
matrices $A$ and $A'$, respectively,

$$e\varphi = Ae, \quad e'\varphi = A'e' \tag{12}$$

By (11), the second equation of (12) reduces to

$$(Te)\varphi = A'(Te)$$

However,

$$(Te)\varphi = T(e\varphi)$$

Indeed, if $(\tau_{i1}, \tau_{i2}, \ldots, \tau_{in})$ is the $i$th row of
matrix $T$, then

$$(\tau_{i1}e_1 + \tau_{i2}e_2 + \ldots + \tau_{in}e_n)\varphi = \tau_{i1}(e_1\varphi) + \tau_{i2}(e_2\varphi) + \ldots + \tau_{in}(e_n\varphi)$$

Hence, by (12),

$$(Te)\varphi = T(e\varphi) = T(Ae) = (TA)e,$$

$$A'(Te) = (A'T)e$$

that is,

$$(TA)e = (A'T)e$$

If for at least one $i$, $1 \le i \le n$, the $i$th row of the matrix
$TA$ is different from the $i$th row of the matrix $A'T$, then two
distinct linear combinations of the vectors $e_1, e_2, \ldots, e_n$ are
equal to each other, which contradicts the linear independence of the
basis $e$. Thus,

$$TA = A'T$$

whence, due to the nonsingularity of the change-of-basis matrix $T$,

$$A' = TAT^{-1}, \quad A = T^{-1}A'T \tag{13}$$

Note that the square matrices $B$ and $C$ are called similar if they
are connected by the equation

$$C = Q^{-1}BQ$$

where $Q$ is some nonsingular matrix. We say that the matrix $C$ is
obtained from $B$ by a transformation by the matrix $Q$.

The equations (13) proved above may be formulated as an important
theorem.

Matrices which represent one and the same linear transformation
in different bases are similar. And the matrix of the linear
transformation $\varphi$ in the basis $e'$ is obtained by transforming the
matrix of this linear transformation in the basis $e$ via the
change-of-basis matrix from basis $e'$ to basis $e$.

Let us point out that if a matrix $A$ represents a linear
transformation $\varphi$ in the basis $e$, then any matrix $B$ similar
to $A$,

$$B = Q^{-1}AQ$$

also represents the transformation $\varphi$ in some basis, namely, in the
basis obtained from $e$ by means of the change-of-basis matrix $Q^{-1}$.
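The relation $A' = TAT^{-1}$ can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original text: it reuses the matrix of the example above as $A$ (with first row taken as (-2, 1, 0)) and borrows the pair $T$, $T^{-1}$ from the change-of-basis example of the preceding section purely as convenient integer data; the helper names are our own.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def vec_mat(row, M):
    """Coordinate row times matrix."""
    return mat_mul([row], M)[0]

A = [[-2, 1, 0], [1, 3, 2], [0, -4, 1]]       # matrix of phi in the basis e
T = [[5, -1, -2], [2, 3, 0], [-2, 1, 1]]      # e' = T e
T_inv = [[3, -1, 6], [-2, 1, -4], [8, -3, 17]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

A_new = mat_mul(mat_mul(T, A), T_inv)         # A' = T A T^{-1}, formula (13)

a = [5, 1, -2]                                # coordinates of a in e
a_new = vec_mat(a, T_inv)                     # coordinates of a in e'
image_old = vec_mat(a, A)                     # image of a, coordinates in e
image_new = vec_mat(a_new, A_new)             # image of a, coordinates in e'
```

Both routes describe the same vector: converting `image_old` to the basis $e'$ gives `image_new`.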

Operations on linear transformations. Associating to every linear
transformation of the space $V_n$ its matrix in a fixed basis, we obtain
(as was proved above) a one-to-one correspondence between all linear
transformations and all square matrices of order $n$. It is natural
to expect that the operations of addition and multiplication of
matrices and also matrix multiplication by a scalar will be associated
with analogous operations involving linear transformations.

Suppose we have the linear transformations $\varphi$ and $\psi$ in a
space $V_n$. The sum of these transformations is the transformation
$\varphi + \psi$ defined by the equation

$$a(\varphi + \psi) = a\varphi + a\psi \tag{14}$$

It thus carries any vector $a$ into the sum of its images under the
transformations $\varphi$ and $\psi$.

The transformation $\varphi + \psi$ is linear. Indeed, for all vectors
$a$ and $b$ and any scalar $\alpha$,

$$(a + b)(\varphi + \psi) = (a + b)\varphi + (a + b)\psi = a\varphi + b\varphi + a\psi + b\psi = a(\varphi + \psi) + b(\varphi + \psi),$$

$$(\alpha a)(\varphi + \psi) = (\alpha a)\varphi + (\alpha a)\psi = \alpha(a\varphi) + \alpha(a\psi) = \alpha(a\varphi + a\psi) = \alpha[a(\varphi + \psi)]$$

On the other hand, we use the term product of linear
transformations $\varphi$ and $\psi$ for the transformation
$\varphi\psi$ defined by the equation

$$a(\varphi\psi) = (a\varphi)\psi \tag{15}$$

that is, the transformation obtained by successive application of the
transformations $\varphi$ and $\psi$.

The transformation $\varphi\psi$ is linear:

$$(a + b)(\varphi\psi) = [(a + b)\varphi]\psi = (a\varphi + b\varphi)\psi = (a\varphi)\psi + (b\varphi)\psi = a(\varphi\psi) + b(\varphi\psi),$$

$$(\alpha a)(\varphi\psi) = [(\alpha a)\varphi]\psi = [\alpha(a\varphi)]\psi = \alpha[(a\varphi)\psi] = \alpha[a(\varphi\psi)]$$

Finally, we use the term product of a linear transformation $\varphi$
by a scalar $\kappa$ for the transformation $\kappa\varphi$ defined by

$$a(\kappa\varphi) = \kappa(a\varphi) \tag{16}$$

Thus, the images of all vectors under the transformation $\varphi$ are
multiplied by the scalar $\kappa$.




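The transformation $\kappa\varphi$ is linear:

$$(a + b)(\kappa\varphi) = \kappa[(a + b)\varphi] = \kappa(a\varphi + b\varphi) = \kappa(a\varphi) + \kappa(b\varphi) = a(\kappa\varphi) + b(\kappa\varphi),$$

$$(\alpha a)(\kappa\varphi) = \kappa[(\alpha a)\varphi] = \kappa[\alpha(a\varphi)] = \alpha[\kappa(a\varphi)] = \alpha[a(\kappa\varphi)]$$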

Let the transformations $\varphi$ and $\psi$ be given in the basis
$e_1, e_2, \ldots, e_n$ by the matrices $A = (\alpha_{ij})$ and
$B = (\beta_{ij})$, respectively,

$$e\varphi = Ae, \quad e\psi = Be$$

Then, by (14),

$$e_i(\varphi + \psi) = e_i\varphi + e_i\psi = \sum_{j=1}^{n} \alpha_{ij}e_j + \sum_{j=1}^{n} \beta_{ij}e_j = \sum_{j=1}^{n} (\alpha_{ij} + \beta_{ij})e_j$$

that is,

$$e(\varphi + \psi) = (A + B)e$$

Thus, the matrix of a sum of linear transformations in any basis is
equal to the sum of the matrices of these transformations in the same
basis.

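On the other hand, by (15),

$$e_i(\varphi\psi) = (e_i\varphi)\psi = \Big(\sum_{j=1}^{n} \alpha_{ij}e_j\Big)\psi = \sum_{j=1}^{n} \alpha_{ij}(e_j\psi) = \sum_{j=1}^{n} \alpha_{ij}\Big(\sum_{k=1}^{n} \beta_{jk}e_k\Big) = \sum_{k=1}^{n} \Big(\sum_{j=1}^{n} \alpha_{ij}\beta_{jk}\Big)e_k$$

that is,

$$e(\varphi\psi) = (AB)e$$

In other words, the matrix of a product of linear transformations in
any basis is equal to the product of the matrices of these
transformations in the same basis.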

Finally, due to (16),

$$e_i(\kappa\varphi) = \kappa(e_i\varphi) = \kappa\sum_{j=1}^{n} \alpha_{ij}e_j = \sum_{j=1}^{n} (\kappa\alpha_{ij})e_j$$

that is,

$$e(\kappa\varphi) = (\kappa A)e$$

Consequently, a matrix which in some basis specifies the product of
a linear transformation $\varphi$ by a scalar $\kappa$ is equal to the
product of the matrix of the transformation $\varphi$ in this basis by
the scalar $\kappa$.
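The three correspondences just proved can be illustrated on numbers. In the Python sketch below (an illustration, not part of the original text) the matrix `A` is that of the example above, while the matrix `B`, the scalar `kappa` and the helper names are hypothetical choices of ours. Images are computed twice: once from the definitions (14), (15), (16), and once from the matrices $A + B$, $AB$ and $\kappa A$.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def vec_mat(row, M):
    """Coordinate row times matrix: image of a vector in the row convention."""
    return mat_mul([row], M)[0]

A = [[-2, 1, 0], [1, 3, 2], [0, -4, 1]]    # matrix of phi
B = [[1, 0, 2], [0, 1, -1], [3, 0, 1]]     # matrix of psi (hypothetical)
kappa = 4                                   # hypothetical scalar
a = [5, 1, -2]

# Images computed from the definitions (14), (15), (16).
sum_image = [x + y for x, y in zip(vec_mat(a, A), vec_mat(a, B))]
product_image = vec_mat(vec_mat(a, A), B)   # apply phi first, then psi
scaled_image = [kappa * x for x in vec_mat(a, A)]

# Matrices of phi + psi and kappa * phi.
A_plus_B = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]
kappa_A = [[kappa * x for x in row] for row in A]
```

Because images are written with the transformation on the right, applying $\varphi$ and then $\psi$ corresponds to the product $AB$ in that order.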

From the results obtained it follows that operations on linear
transformations possess the same properties as operations on matrices.
Thus, the addition of linear transformations is commutative
and associative, while multiplication is associative but is not
commutative for $n > 1$. For linear transformations there exists unique
subtraction. Also note that in linear transformations, the identity
transformation $\varepsilon$ plays the role of unity, and the zero
transformation $\omega$, the role of zero. In any basis, the
transformation $\varepsilon$ is given by the unit matrix, and the
transformation $\omega$ is given by the zero matrix.

32. Linear Subspaces 

A subset L of a linear space V is called a linear subspace of this 
space if it is a linear space with respect to the operations defined 
in V of addition of vectors and the multiplication of a vector by 
a scalar. Thus, in three-dimensional Euclidean space, the collection
of vectors emanating from the coordinate origin and lying in some
plane (or on some straight line) passing through the origin is a linear
subspace.

For a nonempty subset $L$ of space $V$ to be a linear subspace of $V$,
the following requirements must be met.

1. If the vectors $a$ and $b$ lie in $L$, then the vector $a + b$ also
belongs to $L$.

2. If the vector $a$ belongs to $L$, then the vector $\alpha a$, for any
value of the scalar $\alpha$, belongs to $L$ too.

Indeed, by Condition 2, the set $L$ contains the zero vector: if
vector $a$ belongs to $L$, then $L$ also contains $0 \cdot a = 0$.
Furthermore, again by Property 2, $L$ contains, together with a vector
$a$, the inverse vector $-a = (-1) \cdot a$, and therefore, due to
Property 1, $L$ also contains the difference of any two vectors in $L$.
As to all the other requirements that enter into the definition of a
linear space, we can say that if they are fulfilled in $V$, then they
will likewise be fulfilled in $L$.

Instances of linear subspaces of the space $V$ are the space $V$
itself and also the set consisting of a single zero vector, the so-called
zero subspace. A more interesting example is the following: in the
space $V$ take any finite system of vectors

$$a_1, a_2, \ldots, a_r \tag{1}$$

and denote by $L$ the set of all those vectors which are linear
combinations of the vectors of (1). We will prove that $L$ is a linear
subspace. Indeed, if

$$b = \alpha_1 a_1 + \alpha_2 a_2 + \ldots + \alpha_r a_r, \quad c = \beta_1 a_1 + \beta_2 a_2 + \ldots + \beta_r a_r$$

then

$$b + c = (\alpha_1 + \beta_1)a_1 + (\alpha_2 + \beta_2)a_2 + \ldots + (\alpha_r + \beta_r)a_r$$

that is, the vector $b + c$ belongs to $L$; also in $L$ is the vector

$$\gamma b = (\gamma\alpha_1)a_1 + (\gamma\alpha_2)a_2 + \ldots + (\gamma\alpha_r)a_r$$

for any scalar $\gamma$.

We say that this linear subspace $L$ is generated by the system of
vectors (1); in particular, the vectors (1) themselves belong to $L$.

Incidentally, any linear subspace of a finite-dimensional linear
space is generated by a finite system of vectors, for if it is not a zero
subspace, then it possesses a finite basis. The dimension of the linear
subspace $L$ is not greater than the dimension $n$ of the space $V_n$
itself and is equal to $n$ only when $L = V_n$. The dimension of the zero
subspace is of course the number 0.

For any $k$, $0 < k < n$, in the space $V_n$ there are linear subspaces
of dimension $k$. It is sufficient to take a subspace generated by any
system of $k$ linearly independent vectors.

Let there be given linear subspaces $L_1$ and $L_2$ in the space $V$.
The collection $L_0$ of vectors belonging both to $L_1$ and to $L_2$ will
be a linear subspace, as can readily be verified. It is the intersection
of the subspaces $L_1$ and $L_2$. On the other hand, another linear
subspace is the sum $L$ of the subspaces $L_1$ and $L_2$, or the
collection of all those vectors in $V$ which can be represented as a sum
of two terms, one from $L_1$ and the other from $L_2$. If the dimensions
of the subspaces $L_1$, $L_2$, $L_0$ and $L$ are, respectively, $d_1$,
$d_2$, $d_0$ and $d$, then the following formula holds:

$$d = d_1 + d_2 - d_0 \tag{2}$$

which is to say that the dimension of the sum of two subspaces is equal
to the sum of the dimensions of these subspaces diminished by the
dimension of their intersection.

To prove this, let us take an arbitrary basis

$$a_1, a_2, \ldots, a_{d_0} \tag{3}$$

of subspace $L_0$ and augment it to obtain the basis

$$a_1, a_2, \ldots, a_{d_0}, b_{d_0+1}, \ldots, b_{d_1} \tag{4}$$

of the subspace $L_1$ and also augment it to obtain the basis

$$a_1, a_2, \ldots, a_{d_0}, c_{d_0+1}, \ldots, c_{d_2} \tag{5}$$

of the subspace $L_2$. Utilizing the definition of the subspace $L$, it is
easy to see that this subspace is generated by the system of vectors

$$a_1, a_2, \ldots, a_{d_0}, b_{d_0+1}, \ldots, b_{d_1}, c_{d_0+1}, \ldots, c_{d_2} \tag{6}$$

This system consists of $d_0 + (d_1 - d_0) + (d_2 - d_0) = d_1 + d_2 - d_0$
vectors. Formula (2) will thus be proved if we demonstrate the linear
independence of system (6).

Suppose the equation

$$\alpha_1 a_1 + \ldots + \alpha_{d_0} a_{d_0} + \beta_{d_0+1} b_{d_0+1} + \ldots + \beta_{d_1} b_{d_1} + \gamma_{d_0+1} c_{d_0+1} + \ldots + \gamma_{d_2} c_{d_2} = 0$$

with certain numerical coefficients is true. Then

$$d = \alpha_1 a_1 + \ldots + \alpha_{d_0} a_{d_0} + \beta_{d_0+1} b_{d_0+1} + \ldots + \beta_{d_1} b_{d_1} = -\gamma_{d_0+1} c_{d_0+1} - \ldots - \gamma_{d_2} c_{d_2} \tag{7}$$

The left member of this equation lies in $L_1$, the right member in
$L_2$, therefore vector $d$ (which is equal both to the left and to the
right member of this equation) belongs to $L_0$ and, consequently, can
be expressed linearly in terms of the basis (3). However, the right
member of (7) shows that the vector $d$ can also be expressed linearly
in terms of the vectors $c_{d_0+1}, \ldots, c_{d_2}$. Whence, by the linear
independence of system (5), it follows that all the coefficients
$\gamma_{d_0+1}, \ldots, \gamma_{d_2}$ are zero, that is, that $d = 0$; but
then, because of the linear independence of system (4), all the
coefficients $\alpha_1, \ldots, \alpha_{d_0}$, $\beta_{d_0+1}, \ldots, \beta_{d_1}$
are also zero. This proves the linear independence of system (6).

The reader can verify that our proof holds true for the case when
the subspace $L_0$ is a zero subspace, i.e., $d_0 = 0$.
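Formula (2) can be checked on a small case. The Python sketch below is an illustration only, not part of the original text: the function `rank` (our own name) computes the dimension of the subspace generated by a list of coordinate rows by Gaussian elimination over the rationals, and the subspaces of a four-dimensional space are hypothetical, chosen so that their intersection is visibly generated by the single vector in `common`.

```python
from fractions import Fraction

def rank(rows):
    """Dimension of the subspace generated by the given coordinate rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

common = [[0, 1, 0, 0]]               # basis (3) of the intersection L0
L1 = common + [[1, 0, 0, 0]]          # basis (4) of L1
L2 = common + [[0, 0, 1, 1]]          # basis (5) of L2
system_6 = common + L1[1:] + L2[1:]   # system (6), which generates L1 + L2

d0, d1, d2 = rank(common), rank(L1), rank(L2)
d = rank(L1 + L2)                     # dimension of the sum of the subspaces
```

Here `d0 = 1`, `d1 = d2 = 2` and `d = 3`, and system (6) turns out to be linearly independent, as the proof asserts.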

The range of values and the kernel (null space) of a linear
transformation. Suppose we have a linear transformation $\varphi$ in a
linear space $V_n$. If $L$ is any linear subspace of the space $V_n$,
then the collection $L\varphi$ of images of all vectors of $L$ under the
transformation $\varphi$ will also be a linear subspace, as follows
directly from the definitions of a linear subspace and a linear
transformation. In particular, the collection $V_n\varphi$ of images of
all vectors of the space $V_n$ is a linear subspace. It is called the
range of values of the transformation $\varphi$. Let us find the
dimension of the range. To do this, note that since the matrices
representing the transformation $\varphi$ in different bases are
similar, it follows, due to the last theorem of Sec. 14, that they all
have one and the same rank. This number can therefore be termed
the rank of the linear transformation $\varphi$.

The dimension of the range of values of a linear transformation
$\varphi$ is equal to the rank of the transformation.

Indeed, let $\varphi$ be represented in the basis $e_1, e_2, \ldots, e_n$
by the matrix $A$. The subspace $V_n\varphi$ is generated by the vectors

$$e_1\varphi, e_2\varphi, \ldots, e_n\varphi \tag{8}$$

and therefore, as a particular case, any maximal linearly independent
subsystem of system (8) will serve as a basis of the subspace
$V_n\varphi$. However, the maximum number of linearly independent
vectors in system (8) is equal to the maximum number of linearly
independent rows of the matrix $A$, i.e., it is equal to the rank of the
matrix. The theorem is proved.

We know that under the linear transformation $\varphi$ the zero vector
goes into itself. The collection $N(\varphi)$ of all vectors of the space
$V_n$ which under $\varphi$ are mapped into the zero vector is
consequently nonvoid and is evidently a linear subspace. This subspace
is termed the null space of the transformation $\varphi$, and its
dimension is called the nullity of this transformation.

For any linear transformation $\varphi$ of space $V_n$, the sum of the
rank and of the nullity of the transformation is equal to the dimension
$n$ of the whole space.

Indeed, if $r$ is the rank of the transformation $\varphi$, then the
subspace $V_n\varphi$ has the following basis of $r$ vectors:

$$a_1, a_2, \ldots, a_r \tag{9}$$

In $V_n$ we can select the vectors

$$b_1, b_2, \ldots, b_r \tag{10}$$

such that

$$b_i\varphi = a_i, \quad i = 1, 2, \ldots, r$$

The choice of vectors (10) is not unambiguous, naturally. If some
nontrivial linear combination of vectors (10) were mapped into
zero by the transformation $\varphi$, in particular, if the vectors (10)
were linearly dependent, then the vectors (9) would themselves be
linearly dependent, but this runs counter to our assumption. And so the
linear subspace $L$ generated by the vectors (10) has dimension $r$
and its intersection with the subspace $N(\varphi)$ is zero.

On the other hand, the sum of the subspaces $L$ and $N(\varphi)$
coincides with the entire space $V_n$. Indeed, if $c$ is any vector of
the space, it follows that the vector $d = c\varphi$ of course belongs to
the subspace $V_n\varphi$. Then in the subspace $L$ there will be a
vector $b$ such that

$$b\varphi = d$$

The vector $b$ is written in terms of system (10) with the same
coefficients as is the vector $d$ in terms of the basis (9). From this
we have

$$c = b + (c - b)$$

and the vector $c - b$ is contained in the subspace $N(\varphi)$, since

$$(c - b)\varphi = c\varphi - b\varphi = d - d = 0$$

The assertion of the theorem follows from the results obtained
and from the formula (2) that was proved earlier.
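The decomposition used in this proof can be imitated computationally. The following Python sketch is an illustration under our own assumptions, not the author's method: the function name and the sample matrix are hypothetical. It row-reduces the matrix $A$ while recording, next to each working row, the coordinate row $x$ for which that working row equals $xA$. Rows that reduce to zero then yield vectors $x$ of the null space (those with $xA = 0$), and the number of remaining rows is the rank.

```python
from fractions import Fraction

def rank_and_null_space(A):
    """Return (rank, basis of the null space {x : x*A = 0}) for a square matrix A."""
    n = len(A)
    # Each working row has the form (x*A | x); initially x runs over the basis rows.
    rows = [[Fraction(v) for v in A[i]] +
            [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    r = 0
    for col in range(n):
        pivot = next((i for i in range(r, n) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n):
            f = rows[i][col] / rows[r][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    # Rows below position r have a zero left part, so their right part x satisfies x*A = 0.
    null_basis = [row[n:] for row in rows[r:]]
    return r, null_basis

def vec_mat(row, M):
    """Coordinate row times matrix."""
    return [sum(row[i] * M[i][j] for i in range(len(row)))
            for j in range(len(M[0]))]

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]          # hypothetical singular matrix: second row = 2 * first row

r, null_basis = rank_and_null_space(A)
```

For this matrix the rank is 2 and the null space is generated by the single row (-2, 1, 0), so rank and nullity add up to 3.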

Nonsingular linear transformations. A linear transformation
$\varphi$ of a linear space $V_n$ is called nonsingular if it satisfies
any one of the following conditions, the equivalence of which follows
directly from the theorems proved above.

1. The rank of the transformation $\varphi$ is equal to $n$.

2. The entire space $V_n$ serves as the range of values of the
transformation $\varphi$.

3. The nullity of the transformation $\varphi$ is zero.

There are many other definitions of nonsingular linear
transformations that are equivalent to those given above, for instance,
definitions 4 to 6.

4. Distinct vectors of the space $V_n$ have distinct images under
the transformation $\varphi$.

Indeed, if a transformation $\varphi$ has Property 4, then the null
space of this transformation consists of the zero vector alone, i.e.,
Property 3 holds. But if the vectors $a$ and $b$ are such that
$a \ne b$, but $a\varphi = b\varphi$, then $a - b \ne 0$, but
$(a - b)\varphi = 0$, or Property 3 does not hold.

From 2 and 4 there follows

5. The transformation $\varphi$ is a one-to-one mapping of the space
$V_n$ onto this whole space.

From 5 it follows that a nonsingular linear transformation $\varphi$
has an inverse transformation $\varphi^{-1}$ which carries any vector
$a\varphi$ into the vector $a$,

$$(a\varphi)\varphi^{-1} = a$$

The transformation $\varphi^{-1}$ is linear since

$$(a\varphi + b\varphi)\varphi^{-1} = [(a + b)\varphi]\varphi^{-1} = a + b,$$

$$[\alpha(a\varphi)]\varphi^{-1} = [(\alpha a)\varphi]\varphi^{-1} = \alpha a$$

From the definition of the transformation $\varphi^{-1}$ it follows that

$$\varphi\varphi^{-1} = \varphi^{-1}\varphi = \varepsilon \tag{11}$$

The equalities (11) can themselves be viewed as a definition of an
inverse transformation. Then from this and from the last results of
the preceding section it follows that if a nonsingular linear
transformation $\varphi$ is represented in some basis by the matrix $A$
(which is nonsingular due to Property 1), then the transformation
$\varphi^{-1}$ is represented in that basis by the matrix $A^{-1}$.

We thus arrive at the following definition of a nonsingular linear
transformation.

6. A transformation $\varphi$ has an inverse linear transformation
$\varphi^{-1}$.
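A short numerical illustration of the inverse transformation follows; it is a sketch under our own assumptions, not part of the original text. As the matrix of a nonsingular transformation we borrow, purely as convenient integer data, the pair of mutually inverse matrices from the change-of-basis example at the end of Sec. 30; the helper names are hypothetical.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def vec_mat(row, M):
    """Coordinate row times matrix."""
    return mat_mul([row], M)[0]

A = [[5, -1, -2], [2, 3, 0], [-2, 1, 1]]        # matrix of phi (determinant 1)
A_inv = [[3, -1, 6], [-2, 1, -4], [8, -3, 17]]  # matrix of the inverse of phi
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

a = [1, 4, -1]
image = vec_mat(a, A)            # coordinates of the image of a under phi
restored = vec_mat(image, A_inv) # applying the inverse returns a
```

The products `mat_mul(A, A_inv)` and `mat_mul(A_inv, A)` both equal the unit matrix, which is the matrix form of the equalities (11).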


33. Characteristic Roots and Eigenvalues 

Let $A = (\alpha_{ij})$ be a square matrix of order $n$ with real
elements. On the other hand, let $\lambda$ be some unknown. Then the
matrix $A - \lambda E$, where $E$ is a unit matrix of order $n$, is
called the characteristic matrix of the matrix $A$. Since in the matrix
$\lambda E$ the principal diagonal is occupied by $\lambda$ and all
other elements are zero, we have

$$A - \lambda E = \begin{pmatrix} \alpha_{11} - \lambda & \alpha_{12} & \ldots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} - \lambda & \ldots & \alpha_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ \alpha_{n1} & \alpha_{n2} & \ldots & \alpha_{nn} - \lambda \end{pmatrix}$$

The determinant of the matrix $A - \lambda E$ is a polynomial in
$\lambda$ of degree $n$. Indeed, the product of elements on the principal
diagonal is a polynomial in $\lambda$ with highest-degree term
$(-1)^n\lambda^n$; all the other terms of the determinant do not contain
at least two of the elements on the principal diagonal; therefore, their
degree in $\lambda$ does not exceed $n - 2$. It is easy to find the
coefficients of this polynomial. For instance, the coefficient of
$\lambda^{n-1}$ is equal to
$(-1)^{n-1}(\alpha_{11} + \alpha_{22} + \ldots + \alpha_{nn})$ and the
constant term coincides with the determinant of matrix $A$.

The polynomial $|A - \lambda E|$ of degree $n$ is called the
characteristic polynomial of matrix $A$, and its roots (which may be
real or complex) are termed the characteristic roots of the matrix.

Similar matrices have the same characteristic polynomials, and,
consequently, the same characteristic roots.

To see this, let

$$B = Q^{-1}AQ$$

Then, taking into account that the matrix $\lambda E$ commutes with the
matrix $Q$, and $|Q^{-1}| = |Q|^{-1}$, we have

$$|B - \lambda E| = |Q^{-1}AQ - \lambda E| = |Q^{-1}(A - \lambda E)Q| = |Q|^{-1} \cdot |A - \lambda E| \cdot |Q| = |A - \lambda E|$$

The proof is complete.
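The invariance of the characteristic polynomial under similarity can be observed on numbers. The Python sketch below is an illustration, not part of the original text. It computes the coefficients by the Faddeev-LeVerrier recurrence, a technique that does not appear in this book and is used here only because it needs nothing beyond matrix products and traces. The function names are our own; the matrix `A` is the one from the example of Sec. 31 (first row taken as (-2, 1, 0)) and `T`, `T_inv` are the integer pair from the change-of-basis example of Sec. 30.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly(A):
    """Coefficients of |A - lambda*E|, lowest degree first (Faddeev-LeVerrier)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    # c[k] is the coefficient of lambda**k in det(lambda*E - A).
    c = [Fraction(0)] * n + [Fraction(1)]
    M = identity
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        c[n - k] = -sum(AM[i][i] for i in range(n)) / k
        M = [[AM[i][j] + c[n - k] * identity[i][j] for j in range(n)]
             for i in range(n)]
    sign = (-1) ** n      # |A - lambda*E| = (-1)^n * det(lambda*E - A)
    return [sign * x for x in c]

A = [[-2, 1, 0], [1, 3, 2], [0, -4, 1]]
T = [[5, -1, -2], [2, 3, 0], [-2, 1, 1]]
T_inv = [[3, -1, 6], [-2, 1, -4], [8, -3, 17]]

B = mat_mul(mat_mul(T, A), T_inv)   # B = Q^{-1} A Q with Q = T^{-1}
p_A = char_poly(A)
p_B = char_poly(B)
```

For this matrix $|A - \lambda E| = -\lambda^3 + 2\lambda^2 - 2\lambda - 23$: the coefficient of $\lambda^2$ is the sum of the diagonal elements, the constant term is the determinant of $A$, and the similar matrix `B` gives the same list of coefficients.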

From this result it follows (by the theorem proved in Sec. 31
on the relationship between matrices representing a linear
transformation in different bases) that although the linear
transformation $\varphi$ may be represented in different bases by
different matrices, all the matrices have one and the same set of
characteristic roots. These roots can therefore be called the
characteristic roots of the transformation $\varphi$. The set of those
characteristic roots, each root being taken with the multiplicity that
it has in the characteristic polynomial, is called the spectrum of the
linear transformation $\varphi$.

Characteristic roots play a very important role in the study of
linear transformations, as the reader will have ample opportunity
to see. We now investigate one of the applications of characteristic
roots.

Let there be given a linear transformation $\varphi$ in a real linear
space $V_n$. If a vector $b$ (nonzero) is carried by the transformation
$\varphi$ into a vector proportional to $b$,

$$b\varphi = \lambda_0 b \tag{1}$$

where $\lambda_0$ is some real number, then the vector $b$ is called an
eigenvector of the transformation $\varphi$, and the number $\lambda_0$
is an eigenvalue of this transformation. We say that the eigenvector $b$
corresponds to the eigenvalue $\lambda_0$.

Note that since $b \ne 0$, the number $\lambda_0$ which satisfies
Condition (1) is uniquely defined for the vector $b$. Also bear in mind
that the zero vector is not considered to be an eigenvector of the
transformation $\varphi$, although it satisfies Condition (1) for any
$\lambda_0$.

Rotation of the Euclidean plane about the origin through an
angle that is not a multiple of $\pi$ is an example of a linear
transformation which has no eigenvectors. An instance of another extreme
case is the stretching of a plane in which all vectors issuing from
the origin are stretched, say, five times. This is a linear
transformation and all nonzero vectors of the plane are its
eigenvectors; all of them correspond to the eigenvalue 5.

The real characteristic roots (if they exist) of a linear
transformation $\varphi$, and only these roots, serve as eigenvalues of
the transformation.

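Let a transformation $\varphi$ have a matrix $A = (\alpha_{ij})$ in the
basis $e_1, e_2, \ldots, e_n$ and let the vector

$$b = \sum_{i=1}^{n} \beta_i e_i$$

be an eigenvector of the transformation $\varphi$,

$$b\varphi = \lambda_0 b \tag{2}$$

As was proved in Sec. 31,

$$b\varphi = [(\beta_1, \beta_2, \ldots, \beta_n)A]e \tag{3}$$

Equations (2) and (3) lead to the system of equations

$$\begin{aligned}
\beta_1\alpha_{11} + \beta_2\alpha_{21} + \ldots + \beta_n\alpha_{n1} &= \lambda_0\beta_1, \\
\beta_1\alpha_{12} + \beta_2\alpha_{22} + \ldots + \beta_n\alpha_{n2} &= \lambda_0\beta_2, \\
\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots & \\
\beta_1\alpha_{1n} + \beta_2\alpha_{2n} + \ldots + \beta_n\alpha_{nn} &= \lambda_0\beta_n
\end{aligned} \tag{4}$$

Since $b \ne 0$, not all the numbers $\beta_1, \beta_2, \ldots, \beta_n$
are zero. Thus, by (4), the system of homogeneous linear equations

$$\begin{aligned}
(\alpha_{11} - \lambda_0)x_1 + \alpha_{21}x_2 + \ldots + \alpha_{n1}x_n &= 0, \\
\alpha_{12}x_1 + (\alpha_{22} - \lambda_0)x_2 + \ldots + \alpha_{n2}x_n &= 0, \\
\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots & \\
\alpha_{1n}x_1 + \alpha_{2n}x_2 + \ldots + (\alpha_{nn} - \lambda_0)x_n &= 0
\end{aligned} \tag{5}$$

has a nontrivial solution and for this reason its determinant is equal
to zero:

$$\begin{vmatrix} \alpha_{11} - \lambda_0 & \alpha_{21} & \ldots & \alpha_{n1} \\ \alpha_{12} & \alpha_{22} - \lambda_0 & \ldots & \alpha_{n2} \\ \ldots & \ldots & \ldots & \ldots \\ \alpha_{1n} & \alpha_{2n} & \ldots & \alpha_{nn} - \lambda_0 \end{vmatrix} = 0 \tag{6}$$

Taking the transpose, we get

$$|A - \lambda_0 E| = 0 \tag{7}$$

that is to say, the eigenvalue $\lambda_0$ actually does prove to be a
characteristic root (and, quite naturally, a real root) of the matrix
$A$ and, hence, of the linear transformation $\varphi$.

Conversely, let $\lambda_0$ be any real characteristic root of the
transformation $\varphi$ and, consequently, of the matrix $A$. Then we
have equation (7) and therefore equation (6), which is obtained from
(7) by taking the transpose. From this it follows that the system of
homogeneous linear equations (5) has a nontrivial solution, and even
a real one, since all the coefficients of the system are real. If we
denote this solution by

$$(\beta_1, \beta_2, \ldots, \beta_n) \tag{8}$$

we have equations (4). Use $b$ to denote the vector of space $V_n$ having
in the basis $e_1, e_2, \ldots, e_n$ the coordinate row (8). It is clear
that $b \ne 0$. Then equation (3) holds and from (4) and (3) follows (2).
Thus, vector $b$ has proved to be an eigenvector (of the transformation
$\varphi$) corresponding to the eigenvalue $\lambda_0$. This proves the
theorem.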
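For a plane ($n = 2$) the theorem can be followed step by step in code. The Python sketch below is a minimal illustration, not part of the original text: the function names and the sample matrices are our own, it handles only $2 \times 2$ matrices, and it uses floating-point square roots, so exact comparisons are reliable only for small integer data such as that shown. The first function finds the real roots of $|A - \lambda E| = \lambda^2 - (\alpha_{11} + \alpha_{22})\lambda + |A|$; the second returns a nontrivial solution of system (5).

```python
import math

def real_characteristic_roots(A):
    """Real roots of the characteristic polynomial of a 2x2 matrix."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = trace * trace - 4 * det
    if disc < 0:
        return []                        # no real roots, hence no eigenvalues
    s = math.sqrt(disc)
    return sorted({(trace - s) / 2, (trace + s) / 2})

def eigenvector(A, lam):
    """A nontrivial solution (x1, x2) of system (5) for a 2x2 matrix and a root lam."""
    p, q = A[0][0] - lam, A[1][0]        # first equation:  p*x1 + q*x2 = 0
    if (p, q) != (0, 0):
        return [q, -p]
    r, s = A[0][1], A[1][1] - lam        # second equation: r*x1 + s*x2 = 0
    if (r, s) != (0, 0):
        return [s, -r]
    return [1, 0]                        # A - lam*E is zero: every vector is a solution

def vec_mat(row, M):
    """Coordinate row times matrix: coordinates of the image."""
    return [sum(row[i] * M[i][j] for i in range(len(row)))
            for j in range(len(M[0]))]

A = [[2, 1],
     [0, 3]]                             # hypothetical matrix with roots 2 and 3
rotation = [[0, 1],
            [-1, 0]]                     # rotation through a right angle
stretch = [[5, 0],
           [0, 5]]                       # stretching five times

roots = real_characteristic_roots(A)
vectors = [eigenvector(A, lam) for lam in roots]
```

For the rotation the list of real roots is empty, in agreement with the remark that such a transformation has no eigenvectors, while the stretching has the single root 5.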

Note that if we considered a complex linear space, then the 
demand that the characteristic root be real would be superfluous. 
In other words, we would have proved the following theorem: 
The characteristic roots of a linear transformation of a complex linear 
space, and only these roots, serve as eigenvalues of the transformation. 
Whence it follows that in a complex linear space, any linear trans- 
formation has eigenvectors. 

Returning to our real case, note that the collection of eigenvectors
of the linear transformation $\varphi$ which correspond to the
eigenvalue $\lambda_0$ coincides with the collection of nontrivial real
solutions of the system of homogeneous linear equations (5). Whence it
follows that the collection of eigenvectors of the linear transformation
$\varphi$ which correspond to the eigenvalue $\lambda_0$ will, after the
zero vector has been adjoined to it, be a linear subspace of the space
$V_n$. Indeed, from what was proved in Sec. 12, it follows that the
collection of (real) solutions of any system of homogeneous linear
equations in $n$ unknowns is a linear subspace of the space $V_n$.

Linear transformations with a simple spectrum. In many cases
it is necessary to know whether a given linear transformation $\varphi$
can have a diagonal matrix in some basis. As a matter of fact, by no
means every linear transformation can be represented by a diagonal
matrix. The necessary and sufficient conditions for this will be
indicated in Sec. 61. In the meantime we wish to indicate one sufficient
condition.

We will first prove the following auxiliary results.

A linear transformation $\varphi$ is represented by a diagonal matrix
in a basis $e_1, e_2, \ldots, e_n$ if and only if all the vectors of the
basis are eigenvectors of the transformation $\varphi$.

Indeed, the equation

$$e_i\varphi = \lambda_i e_i$$

is equivalent to the fact that in the $i$th row of the matrix
representing the transformation in the indicated basis all off-diagonal
elements are zero and the principal diagonal has the number $\lambda_i$
(in the $i$th position).

The eigenvectors $b_1, b_2, \ldots, b_k$ of the linear transformation
$\varphi$ which correspond to different eigenvalues constitute a linearly
independent system.

We shall prove this assertion by induction with respect to $k$,
since for $k = 1$ it holds true: a single eigenvector, being nonzero,
constitutes a linearly independent system. Let

$$b_i\varphi = \lambda_i b_i, \quad i = 1, 2, \ldots, k$$

and

$$\lambda_i \ne \lambda_j \quad \text{for } i \ne j$$

If there exists a linear dependence

$$\alpha_1 b_1 + \alpha_2 b_2 + \ldots + \alpha_k b_k = 0 \tag{9}$$

where, for example, $\alpha_1 \ne 0$, then, applying the transformation
$\varphi$ to both sides of (9), we get

$$\alpha_1\lambda_1 b_1 + \alpha_2\lambda_2 b_2 + \ldots + \alpha_k\lambda_k b_k = 0$$

Subtracting equation (9) multiplied by $\lambda_k$, we get

$$\alpha_1(\lambda_1 - \lambda_k)b_1 + \alpha_2(\lambda_2 - \lambda_k)b_2 + \ldots + \alpha_{k-1}(\lambda_{k-1} - \lambda_k)b_{k-1} = 0$$

which yields a nontrivial linear dependence between the vectors
$b_1, b_2, \ldots, b_{k-1}$, since $\alpha_1(\lambda_1 - \lambda_k) \ne 0$.
This contradicts the induction hypothesis.

We say that a linear transformation $\varphi$ of a real linear space
$V_n$ has a simple spectrum if all its characteristic roots are real and
distinct. Consequently, the transformation $\varphi$ has $n$ distinct
eigenvalues and therefore, by the theorem just proved, the space $V_n$
has a basis composed of the eigenvectors of this transformation. Thus,
any linear transformation with a simple spectrum may be represented
by a diagonal matrix.
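A small numerical case makes the passage to an eigenvector basis concrete. The Python sketch below is an illustration, not part of the original text; the matrix and helper names are hypothetical. The matrix `A` has the distinct real characteristic roots 2 and 3, with eigenvectors whose coordinate rows are (1, -1) and (0, 1). Taking these rows as the change-of-basis matrix $Q$ from the basis $e$ to the eigenvector basis, formula (13) of Sec. 31, $A' = TAT^{-1}$ with $T = Q$, should give a diagonal matrix.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[2, 1],
     [0, 3]]            # matrix of phi in the basis e1, e2

# Rows of Q: coordinates in e1, e2 of the eigenvectors b1 (root 2) and b2 (root 3).
Q = [[1, -1],
     [0, 1]]
Q_inv = [[1, 1],
         [0, 1]]

# Each row of Q is reproduced by A up to its eigenvalue: b_i * A = lambda_i * b_i.
images = mat_mul(Q, A)

# Matrix of phi in the basis b1, b2.
diagonal = mat_mul(mat_mul(Q, A), Q_inv)
```

The result is the diagonal matrix with 2 and 3 on the principal diagonal, the eigenvalues appearing in the order in which the eigenvectors were listed.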

Passing from the linear transformation to the matrix representing
it, we obtain the following result.

Any matrix whose characteristic roots are all real and distinct is
similar to a diagonal matrix, or we say that such a matrix can be
reduced to diagonal form (diagonalized).

EUCLIDEAN SPACES


Definition of a Euclidean Space. Orthonormal Bases


The concept of an $n$-dimensional linear space does not by any
means fully generalize the concept of a plane or three-dimensional
Euclidean space: in the $n$-dimensional case, for $n > 3$, neither
the length of a vector nor the angle between vectors is defined and
it is therefore impossible to develop the rich geometrical theory so
familiar to the reader for $n = 2$ and $n = 3$. It turns out, however,
that we can rectify the situation in the following manner.

From analytic geometry we know that for two-dimensional
(a plane) and three-dimensional space we can introduce the concept
of scalar multiplication of vectors. It is defined by means of the
lengths of the vectors and the angle between them; it appears, however,
that both the length of a vector and the angle between vectors
can, in turn, be expressed in terms of scalar products. We will
therefore define the concept of scalar multiplication (we will define
it axiomatically) for any $n$-dimensional linear space. This will be
done with the aid of certain properties which we know the scalar
multiplication of vectors in the plane or in three-dimensional space
actually possesses. Considering the immediate reasons for this material
being included in the course of higher algebra, we dispense
with the definitions of the length of a vector and the angle between
vectors. The reader interested in the construction of geometry in
$n$-dimensional spaces is referred to the special literature, in
particular, to more exhaustive texts on linear algebra.

The reader should bear in mind that, with the exception of the
end of this section, the whole chapter deals solely with real linear
spaces.

We shall say that scalar multiplication is defined in an
n-dimensional real linear space $V_n$ if to every pair of vectors a, b
there is associated a real number denoted by the symbol (a, b) and
called the scalar product of the vectors a and b, and the following
conditions are satisfied (here, a, b, c are any vectors of the space
$V_n$ and $\alpha$ is any real number, or scalar):

I. $(a, b) = (b, a)$.

II. $(a + b, c) = (a, c) + (b, c)$.

III. $(\alpha a, b) = \alpha (a, b)$.

IV. If $a \ne 0$, then the scalar square of the vector a is strictly
positive:

$$(a, a) > 0$$

Note that from III we have, for $\alpha = 0$, the equation

$$(0, b) = 0 \tag{1}$$

which states that the scalar product of the zero vector by any vector b
is zero; in particular, the scalar square of the zero vector is also zero.

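From II and III there immediately follows a formula for the
scalar product of linear combinations of two systems of vectors:

$$\Bigl(\sum_{i=1}^{k} \alpha_i a_i, \sum_{j=1}^{l} \beta_j b_j\Bigr) = \sum_{i=1}^{k} \sum_{j=1}^{l} \alpha_i \beta_j (a_i, b_j) \tag{2}$$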

If scalar multiplication is defined in an n-dimensional linear 
space then the space is termed n-dimensional Euclidean space. 

It is possible to define scalar multiplication in an n-dimensional
linear space $V_n$ for any n, which is to say that we can convert this
space into a Euclidean space.

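Indeed, in $V_n$ take any basis $e_1, e_2, \ldots, e_n$. If

$$a = \sum_{i=1}^{n} \alpha_i e_i, \qquad b = \sum_{i=1}^{n} \beta_i e_i$$

then put

$$(a, b) = \sum_{i=1}^{n} \alpha_i \beta_i \tag{3}$$

It is easy to see that Conditions I-IV will be fulfilled, that is,
equation (3) defines scalar multiplication in the space $V_n$.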
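
The reader may find it helpful to check definition (3) numerically. The following Python sketch is an illustration only: it assumes that vectors are given by their coordinate rows in a fixed basis, and the helper names `scalar_product`, `add` and `scale` are our own, not notation of the text. It spot-checks Conditions I-IV on a few sample vectors, which of course is not a proof.

```python
def scalar_product(a, b):
    """Definition (3): sum of products of corresponding coordinates."""
    return sum(x * y for x, y in zip(a, b))

def add(a, b):
    """Coordinate row of the sum a + b."""
    return [x + y for x, y in zip(a, b)]

def scale(alpha, a):
    """Coordinate row of the product alpha * a."""
    return [alpha * x for x in a]

# Sample vectors and scalar (arbitrary choices for illustration).
a, b, c = [1, 2, 3], [4, -1, 0], [2, 2, -5]
alpha = 3

axiom_I = scalar_product(a, b) == scalar_product(b, a)
axiom_II = scalar_product(add(a, b), c) == scalar_product(a, c) + scalar_product(b, c)
axiom_III = scalar_product(scale(alpha, a), b) == alpha * scalar_product(a, b)
axiom_IV = scalar_product(a, a) > 0
```

Integer coordinates are used so that the equalities hold exactly; with floating-point coordinates a small tolerance would be needed.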

Generally speaking, we see that in n-dimensional linear space
it is possible to specify scalar multiplication in many different
ways. Naturally, definition (3) depends on the choice of the basis,
but as yet we do not know whether it is possible to introduce scalar
multiplication in any other fundamentally different manner or not.
Our immediate purpose is to survey all possible modes of converting
n-dimensional linear space into Euclidean space and of establishing
the fact that in a certain sense there is only one n-dimensional
Euclidean space for any n.

Suppose we have an arbitrary n-dimensional Euclidean space $E_n$,
which means that scalar multiplication has been introduced in some
fashion into an n-dimensional linear space. The vectors a and b
are orthogonal if their scalar product is zero,

$$(a, b) = 0$$

From (1) it follows that the zero vector is orthogonal to any vector;
however, there can be nonzero orthogonal vectors too.

A set of vectors is called an orthogonal system if all the vectors 
are pairwise orthogonal. 

Every orthogonal system of nonzero vectors is linearly independent.
Indeed, let there be a system of vectors $a_1, a_2, \ldots, a_k$ in $E_n$
and let $a_i \ne 0$, $i = 1, 2, \ldots, k$, and

$$(a_i, a_j) = 0, \qquad i \ne j \tag{4}$$

If

$$\alpha_1 a_1 + \alpha_2 a_2 + \ldots + \alpha_k a_k = 0$$

then by forming the scalar product of both sides of this equation by
the vector $a_i$, $1 \le i \le k$, we get [by (1), (2) and (4)]

$$0 = (0, a_i) = (\alpha_1 a_1 + \alpha_2 a_2 + \ldots + \alpha_k a_k, a_i)$$
$$= \alpha_1 (a_1, a_i) + \alpha_2 (a_2, a_i) + \ldots + \alpha_k (a_k, a_i)$$
$$= \alpha_i (a_i, a_i)$$

Whence, since $(a_i, a_i) > 0$ by IV, it follows that $\alpha_i = 0$,
$i = 1, 2, \ldots, k$, which is what we set out to prove.

We now describe the orthogonalization process, which is a means
of passing from any linearly independent system of k vectors

$$a_1, a_2, \ldots, a_k \tag{5}$$

of Euclidean space $E_n$ to an orthogonal system, also consisting of k
nonzero vectors. We denote these vectors by $b_1, b_2, \ldots, b_k$.

Let us put $b_1 = a_1$, which is to say that the first vector of
system (5) will enter into the orthogonal system we are building.
After that, put

$$b_2 = \alpha_1 b_1 + a_2$$

Since $b_1 = a_1$ and the vectors $a_1$ and $a_2$ are linearly independent,
it follows that the vector $b_2$ is different from zero for any scalar
$\alpha_1$. We choose this scalar remembering that the vector $b_2$ must be
orthogonal to the vector $b_1$:

$$0 = (b_1, b_2) = (b_1, \alpha_1 b_1 + a_2) = \alpha_1 (b_1, b_1) + (b_1, a_2)$$

whence, by IV,

$$\alpha_1 = -\frac{(b_1, a_2)}{(b_1, b_1)}$$

Suppose an orthogonal system of nonzero vectors $b_1, b_2, \ldots, b_l$
has already been constructed; we also assume that for any i,
$1 \le i \le l$, the vector $b_i$ is a linear combination of the vectors
$a_1, a_2, \ldots, a_i$. Then this assumption will also hold for the
vector $b_{l+1}$ if it is chosen in the form

$$b_{l+1} = \alpha_1 b_1 + \alpha_2 b_2 + \ldots + \alpha_l b_l + a_{l+1}$$

The vector $b_{l+1}$ will then be different from zero, since system (5)
is linearly independent and the vector $a_{l+1}$ does not enter into the
expressions for the vectors $b_1, b_2, \ldots, b_l$. We choose the
coefficients $\alpha_i$, $i = 1, 2, \ldots, l$, from the fact that the
vector $b_{l+1}$ must be orthogonal to all the vectors $b_i$,
$i = 1, 2, \ldots, l$:

$$0 = (b_i, b_{l+1}) = (b_i, \alpha_1 b_1 + \alpha_2 b_2 + \ldots + \alpha_l b_l + a_{l+1})$$
$$= \alpha_1 (b_i, b_1) + \alpha_2 (b_i, b_2) + \ldots + \alpha_l (b_i, b_l) + (b_i, a_{l+1})$$

whence, since the vectors $b_1, b_2, \ldots, b_l$ are mutually orthogonal,

$$\alpha_i (b_i, b_i) + (b_i, a_{l+1}) = 0$$

or

$$\alpha_i = -\frac{(b_i, a_{l+1})}{(b_i, b_i)}, \qquad i = 1, 2, \ldots, l$$

Continuing this process, we can construct the desired orthogonal
system $b_1, b_2, \ldots, b_k$.
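
The orthogonalization process just described can be sketched in Python as follows. This is an illustration under stated assumptions: vectors are coordinate rows, the scalar product is taken as the sum of products of coordinates, the names `dot` and `orthogonalize` are ours, and the input system is assumed to be linearly independent (otherwise a zero vector appears and a later division fails). Exact fractions are used so that orthogonality can be checked without rounding.

```python
from fractions import Fraction

def dot(a, b):
    """Scalar product as the sum of products of coordinates."""
    return sum(x * y for x, y in zip(a, b))

def orthogonalize(vectors):
    """Build b_1, ..., b_k with b_{l+1} = a_{l+1} + sum_i alpha_i * b_i,
    where alpha_i = -(b_i, a_{l+1}) / (b_i, b_i)."""
    basis = []
    for a in vectors:
        a = [Fraction(x) for x in a]
        b = list(a)
        for prev in basis:
            alpha = -dot(prev, a) / dot(prev, prev)
            b = [x + alpha * p for x, p in zip(b, prev)]
        basis.append(b)
    return basis

# A sample linearly independent system in three-dimensional space.
system = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
b1, b2, b3 = orthogonalize(system)
```

Note that the first vector of the system is kept unchanged, in agreement with the first step of the process.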

Applying the orthogonalization process to an arbitrary basis
of the space $E_n$ we obtain an orthogonal system of n nonzero
vectors, that is to say, an orthogonal basis, since (as has been proved)
this system is linearly independent. Now, using the remark made
in connection with the first step of the process of orthogonalization,
and also taking into account the fact that any nonzero vector may
be included in some basis of the space, we can even make the
following assertion.

Every Euclidean space possesses orthogonal bases, and any nonzero 
vector of this space enters into some orthogonal basis. 

In what follows, an important role will be played by a special
type of orthogonal basis. Bases of this kind correspond to the
rectangular Cartesian systems of coordinates used in analytic geometry.

We shall call a vector b normalized if its scalar square is equal
to unity:

$$(b, b) = 1$$

If $a \ne 0$, whence $(a, a) > 0$, then the transition to the vector

$$b = \frac{1}{\sqrt{(a, a)}}\, a$$

is termed normalization of the vector a. The vector b is normalized
since

$$(b, b) = \Bigl(\frac{1}{\sqrt{(a, a)}}\, a, \frac{1}{\sqrt{(a, a)}}\, a\Bigr) = \Bigl(\frac{1}{\sqrt{(a, a)}}\Bigr)^{2} (a, a) = 1$$

A basis $e_1, e_2, \ldots, e_n$ for the Euclidean space $E_n$ is called
orthonormal if it is orthogonal and all its vectors are normalized,
that is,

$$(e_i, e_j) = 0, \quad i \ne j; \qquad (e_i, e_i) = 1, \quad i = 1, 2, \ldots, n \tag{6}$$


Every Euclidean space has orthonormal bases. 

To prove this, it will suffice to take any orthogonal basis and
to normalize all its vectors. The basis will remain orthogonal, since
for any $\alpha$ and $\beta$ it follows from $(a, b) = 0$ that

$$(\alpha a, \beta b) = \alpha \beta (a, b) = 0$$
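
As a small illustration of this passage from an orthogonal basis to an orthonormal one, consider the following Python sketch. It assumes coordinate rows with the usual sum-of-products scalar product; the names `dot` and `normalize` and the sample basis are our own choices.

```python
import math

def dot(a, b):
    """Scalar product as the sum of products of coordinates."""
    return sum(x * y for x, y in zip(a, b))

def normalize(a):
    """Pass from a nonzero vector a to (1 / sqrt((a, a))) * a."""
    length = math.sqrt(dot(a, a))
    return [x / length for x in a]

# An orthogonal, but not normalized, basis of the plane.
orthogonal_basis = [[3.0, 4.0], [-4.0, 3.0]]
e1, e2 = (normalize(v) for v in orthogonal_basis)
```

The vectors e1 and e2 remain orthogonal after normalization, since each is merely multiplied by a scalar.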


A basis $e_1, e_2, \ldots, e_n$ of a Euclidean space $E_n$ is orthonormal
if and only if the scalar product of any two vectors of the space is
equal to the sum of the products of the corresponding coordinates of
the vectors in the indicated basis; that is, from

$$a = \sum_{i=1}^{n} \alpha_i e_i, \qquad b = \sum_{j=1}^{n} \beta_j e_j \tag{7}$$

follows

$$(a, b) = \sum_{i=1}^{n} \alpha_i \beta_i \tag{8}$$

Indeed, if equations (6) hold for our basis, then

$$(a, b) = \Bigl(\sum_{i=1}^{n} \alpha_i e_i, \sum_{j=1}^{n} \beta_j e_j\Bigr) = \sum_{i,j=1}^{n} \alpha_i \beta_j (e_i, e_j) = \sum_{i=1}^{n} \alpha_i \beta_i$$

Conversely, if our basis is such that for any vectors a and b written
in this basis in the form (7), equation (8) holds true, then, taking
for a and b any two vectors $e_i$ and $e_j$ in the basis, which are
distinct or the same, we can derive (6) from (8).

Comparing the result just obtained with the earlier given proof
of the existence of n-dimensional Euclidean spaces for any n, we can
make the following assertion: if an arbitrary basis is chosen in an
n-dimensional linear space $V_n$, then in $V_n$ we can specify scalar
multiplication so that in the resulting Euclidean space the chosen
basis will be one of the orthonormal bases.

Isomorphism of Euclidean spaces. Euclidean spaces E and E'
are termed isomorphic if we can establish between the vectors of
these spaces a one-to-one correspondence such that the following
requirements are met.

(1) The correspondence is an isomorphic correspondence
between E and E', which are regarded as linear spaces (see Sec. 29).

(2) In this correspondence the scalar product is preserved; in
other words, if the images of the vectors a and b of E are the
vectors a' and b' of E', respectively, then

$$(a, b) = (a', b') \tag{9}$$


From Condition (1) it follows immediately that isomorphic
Euclidean spaces have one and the same dimension. We will prove the
converse.

Any Euclidean spaces E and E' having the same dimension n are
isomorphic to each other.

In the spaces E and E', choose the orthonormal bases

$$e_1, e_2, \ldots, e_n \tag{10}$$

and, respectively,

$$e_1', e_2', \ldots, e_n' \tag{11}$$

If we associate every vector

$$a = \sum_{i=1}^{n} \alpha_i e_i$$

in E with a vector

$$a' = \sum_{i=1}^{n} \alpha_i e_i'$$

in E', having in the basis (11) the same coordinates as the vector a
in the basis (10), we will obviously get an isomorphic correspondence
between the linear spaces E and E'. We will show that (9) holds as
well: if

$$b = \sum_{i=1}^{n} \beta_i e_i, \qquad b' = \sum_{i=1}^{n} \beta_i e_i'$$

then, by (8) [use the fact that the bases (10) and (11) are
orthonormal!],

$$(a, b) = \sum_{i=1}^{n} \alpha_i \beta_i = (a', b')$$


It is natural not to consider isomorphic Euclidean spaces as 
distinct and so for every n there exists a unique n-dimensional Eu- 
clidean space in the same sense that for every n there exists a 
unique n-dimensional real linear space. 

The concepts and results of this section may be extended to
the case of complex linear spaces in the following manner. A complex
linear space is called a unitary space if scalar multiplication
is given and (a, b) is, in general, a complex number. Axioms II-IV
must hold true (note, in the statement of Axiom IV, that the scalar
square of a nonzero vector is real and is strictly positive), and Axiom
I is replaced by the axiom

I'. $(a, b) = \overline{(b, a)}$

where, as usual, the bar denotes the complex conjugate.

Consequently, scalar multiplication will no longer be
commutative. Still, an equation that is symmetric to Axiom II holds true,

II'. $(a, b + c) = (a, b) + (a, c)$

since

$$(a, b + c) = \overline{(b + c, a)} = \overline{(b, a) + (c, a)} = \overline{(b, a)} + \overline{(c, a)} = (a, b) + (a, c)$$

On the other hand

III'. $(a, \alpha b) = \bar{\alpha} (a, b)$

since

$$(a, \alpha b) = \overline{(\alpha b, a)} = \overline{\alpha (b, a)} = \bar{\alpha}\, \overline{(b, a)} = \bar{\alpha} (a, b)$$

The concepts of orthogonality and of an orthonormal system of
vectors are carried over to the case of unitary spaces without any
alterations. As before, proof is given of the existence of orthonormal
bases in any finite-dimensional unitary space. Here, however, if
$e_1, e_2, \ldots, e_n$ is an orthonormal basis and the vectors a, b have
the notations (7) in this basis, then

$$(a, b) = \sum_{i=1}^{n} \alpha_i \bar{\beta}_i$$

The results of the other sections of this chapter can also be ex- 
tended from Euclidean to unitary spaces, but we will not do this 
and will refer the interested reader to special books on linear algebra. 
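
For the reader who wishes to experiment, the scalar product of a unitary space in an orthonormal basis may be sketched in Python with the built-in complex type. The function name `hermitian_product` and the sample vectors are our own assumptions; the sketch merely illustrates Axioms I' and III' on one example.

```python
def hermitian_product(a, b):
    """(a, b) = sum of alpha_i * conjugate(beta_i) in an orthonormal basis."""
    return sum(x * y.conjugate() for x, y in zip(a, b))

# Sample coordinate rows and a sample complex scalar.
a = [1 + 2j, 3j]
b = [2 - 1j, 1 + 1j]
alpha = 2 + 1j

ab = hermitian_product(a, b)                               # (a, b)
ba = hermitian_product(b, a)                               # (b, a)
a_alpha_b = hermitian_product(a, [alpha * y for y in b])   # (a, alpha b)
aa = hermitian_product(a, a)                               # scalar square of a
```

Here (a, b) and (b, a) are complex conjugates of each other, and the scalar alpha leaves the second factor as its conjugate.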


35. Orthogonal Matrices, Orthogonal Transformations

Let there be given a real linear transformation of n unknowns:

$$x_i = \sum_{k=1}^{n} q_{ik} y_k, \qquad i = 1, 2, \ldots, n \tag{1}$$

Denote the matrix of the transformation by Q. This transformation
carries the sum of the squares of the unknowns $x_1, x_2, \ldots, x_n$,
that is, the quadratic form $x_1^2 + x_2^2 + \ldots + x_n^2$, which is the
normal form of positive definite quadratic forms (see Sec. 28), into a
certain quadratic form in the unknowns $y_1, y_2, \ldots, y_n$. Quite
accidentally, this new quadratic form may itself turn out to be a sum
of the squares of the unknowns $y_1, y_2, \ldots, y_n$; that is, we can
have the equation

$$x_1^2 + x_2^2 + \ldots + x_n^2 = y_1^2 + y_2^2 + \ldots + y_n^2 \tag{2}$$

which, after replacing the unknowns $x_1, x_2, \ldots, x_n$ by their
expressions (1), becomes an identity. The linear transformation of
unknowns (1) which has this property (or, as we say, such as leaves
the sum of the squares of the unknowns invariant) is called an
orthogonal transformation of the unknowns. Its matrix Q is an orthogonal
matrix.

There are many other definitions of an orthogonal transformation
and an orthogonal matrix which are equivalent to those given above.
We now give some of them that will be needed in the sequel.

In Sec. 26 we gave a rule for the transformation of the matrix
of a quadratic form under a linear transformation of the unknowns.
Applying it to our case and taking into account that the unit
matrix E is the matrix of a quadratic form (being the sum of the squares
of all the unknowns), we find that equation (2) is equivalent to the
matrix equation

$$Q'EQ = E$$

that is,

$$Q'Q = E \tag{3}$$

Whence

$$Q' = Q^{-1} \tag{4}$$

and so the following equation holds true too:

$$QQ' = E \tag{5}$$

Thus, by (4), an orthogonal matrix Q may be defined as a matrix
for which the transpose Q' is equal to the inverse matrix $Q^{-1}$. Each
one of the equations (3) and (5) can also be taken as a definition of an
orthogonal matrix.

Since the columns of Q' are the rows of Q, it follows from (5) that
the square matrix Q is orthogonal if and only if the sum of the squares
of all elements of any one of its rows is equal to unity, and the sum of
the products of the corresponding elements of any two distinct rows is
zero. From (3) follows an analogous assertion for the columns of a
matrix Q.

Taking determinants in (3), we get (since |Q'| = |Q|)

$$|Q|^2 = 1$$

Whence it follows that the determinant of an orthogonal matrix is
equal to ±1. Thus any orthogonal transformation of unknowns is
a nonsingular transformation. We cannot, quite naturally, assert the
converse; also note that by no means every matrix with determinant
±1 is orthogonal.
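
The criterion (3) and the remark on determinants can be tried out on second-order matrices with the following Python sketch. The helper names `transpose`, `matmul`, `is_orthogonal` and `det2`, the tolerance, and both sample matrices are assumptions made for illustration.

```python
def transpose(m):
    """Transposed matrix m'."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def is_orthogonal(q, tol=1e-12):
    """Check Q'Q = E entry by entry, up to a small tolerance."""
    n = len(q)
    product = matmul(transpose(q), q)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def det2(m):
    """Determinant of a second-order matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

q = [[0.6, -0.8], [0.8, 0.6]]      # rows of unit length, mutually orthogonal
shear = [[1.0, 1.0], [0.0, 1.0]]   # determinant 1, yet not orthogonal
```

The second matrix shows why the converse fails: its determinant is 1, but its second column does not have unit length.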


A matrix that is inverse to an orthogonal matrix will itself be
orthogonal. Indeed, taking transposes in (4), we obtain

$$(Q^{-1})' = (Q')' = Q = (Q^{-1})^{-1}$$

On the other hand, a product of orthogonal matrices is orthogonal.
Indeed, if matrices Q and R are orthogonal, then, using (4), and
also (6) of Sec. 26 and an analogous equation which is true for
inverses, we get

$$(QR)' = R'Q' = R^{-1}Q^{-1} = (QR)^{-1}$$

In Sec. 37, use will be made of the following assertion. 

The change-of-basis matrix from an orthonormal basis of a
Euclidean space to any other of its orthonormal bases is orthogonal.

In a space $E_n$ let there be given two orthonormal bases
$e_1, e_2, \ldots, e_n$ and $e_1', e_2', \ldots, e_n'$ with the
change-of-basis matrix $Q = (q_{ij})$,

$$e' = Qe$$

Since the basis e is orthonormal, the scalar product of any two vectors
(of any two vectors from the basis e', for instance) is equal to the
sum of the products of the corresponding coordinates of these vectors
in the basis e. However, since basis e' is also orthonormal, the scalar
square of each vector of e' is equal to unity, and the scalar product
of any two distinct vectors of e' is equal to zero. Whence, for the
rows of coordinates of the vectors of basis e' in basis e (i.e., for the
rows of matrix Q), follow the assertions which, as derived above
from (5), are characteristic of an orthogonal matrix.

Orthogonal transformations of Euclidean space. It will be 
well at this point to make a study of an interesting special type of 
linear transformations of Euclidean spaces, though such transfor- 
mations will not be used in the sequel. 

A linear transformation $\varphi$ of a Euclidean space $E_n$ is called an
orthogonal transformation of that Euclidean space if it preserves the
scalar square of every vector, that is, for any vector a,

$$(a\varphi, a\varphi) = (a, a) \tag{6}$$

From this we derive the following more general assertion, which 
quite naturally can also be taken as a definition of an orthogonal 
transformation. 

An orthogonal transformation $\varphi$ of a Euclidean space preserves
the scalar product of any two vectors a, b:

$$(a\varphi, b\varphi) = (a, b) \tag{7}$$

Indeed, by (6),

$$((a + b)\varphi, (a + b)\varphi) = (a + b, a + b)$$

However,

$$((a + b)\varphi, (a + b)\varphi) = (a\varphi + b\varphi, a\varphi + b\varphi) = (a\varphi, a\varphi) + (a\varphi, b\varphi) + (b\varphi, a\varphi) + (b\varphi, b\varphi),$$

$$(a + b, a + b) = (a, a) + (a, b) + (b, a) + (b, b)$$

Whence, using (6) both for a and for b, and taking into account the
commutativity of scalar multiplication, we obtain

$$2 (a\varphi, b\varphi) = 2 (a, b)$$

and so (7) holds true.
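
The argument above recovers the scalar product from scalar squares alone. The following Python sketch illustrates this on one example; it assumes coordinate rows in an orthonormal basis, a transformation applied as "row times matrix", and a sample rotation matrix chosen by us. The names `dot` and `apply` are hypothetical helpers.

```python
def dot(a, b):
    """Scalar product as the sum of products of coordinates."""
    return sum(x * y for x, y in zip(a, b))

def apply(a, q):
    """Coordinate row of the image of a: the row a times the matrix q."""
    return [sum(a[i] * q[i][j] for i in range(len(a))) for j in range(len(q[0]))]

q = [[0.6, 0.8], [-0.8, 0.6]]   # assumed sample: matrix of a rotation
a, b = [1.0, 2.0], [3.0, -1.0]

# (a, b) expressed through the scalar squares of a + b, a and b.
s = [x + y for x, y in zip(a, b)]
polarized = (dot(s, s) - dot(a, a) - dot(b, b)) / 2

image_product = dot(apply(a, q), apply(b, q))   # scalar product of the images
```

Since the scalar squares are preserved by the transformation, the same expression evaluated on the images gives the same number.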

In an orthogonal transformation of a Euclidean space, the images
of all vectors of any orthonormal basis themselves form an orthonormal
basis. Conversely, if a linear transformation of a Euclidean space
carries at least one orthonormal basis again into an orthonormal basis,
then the transformation is orthogonal.

Indeed, let $\varphi$ be an orthogonal transformation of the space $E_n$,
and let $e_1, e_2, \ldots, e_n$ be an arbitrary orthonormal basis of this
space. Due to (7), there follow from the equations

$$(e_i, e_i) = 1, \quad i = 1, 2, \ldots, n,$$
$$(e_i, e_j) = 0 \quad \text{for } i \ne j$$

the equations

$$(e_i\varphi, e_i\varphi) = 1, \quad i = 1, 2, \ldots, n,$$
$$(e_i\varphi, e_j\varphi) = 0, \quad i \ne j$$

That is, the system of vectors $e_1\varphi, e_2\varphi, \ldots, e_n\varphi$
proves to be orthogonal and normalized; for this reason it is an
orthonormal basis of the space $E_n$.

Conversely, let a linear transformation $\varphi$ of the space $E_n$ carry
the orthonormal basis $e_1, e_2, \ldots, e_n$ again into an orthonormal
basis; that is, the system of vectors $e_1\varphi, e_2\varphi, \ldots,
e_n\varphi$ is an orthonormal basis of the space $E_n$. If

$$a = \sum_{i=1}^{n} \alpha_i e_i$$

is an arbitrary vector of the space $E_n$, then

$$a\varphi = \sum_{i=1}^{n} \alpha_i (e_i\varphi)$$

The vector $a\varphi$ has the same coordinates in the basis $e\varphi$ as the
vector a has in the basis e. However, both those bases are orthonormal,
and for this reason the scalar square of any vector is equal to the sum
of the squares of its coordinates in any one of these bases. Thus

$$(a, a) = (a\varphi, a\varphi) = \sum_{i=1}^{n} \alpha_i^2$$

Equation (6) indeed holds true.

An orthogonal transformation of a Euclidean space in any
orthonormal basis is represented by an orthogonal matrix. Conversely, if
a linear transformation of a Euclidean space in at least one orthonormal
basis is represented by an orthogonal matrix, then the transformation
is orthogonal.

Indeed, if the transformation $\varphi$ is orthogonal, and the basis
$e_1, e_2, \ldots, e_n$ is orthonormal, then the system of vectors
$e_1\varphi, e_2\varphi, \ldots, e_n\varphi$ will also be an orthonormal
basis. The matrix A of the transformation $\varphi$ in the basis e,

$$e\varphi = Ae \tag{8}$$

will thus be the transition matrix from the orthonormal basis e to
the orthonormal basis $e\varphi$, i.e. (as proved above), it will be
orthogonal.

Conversely, let a linear transformation $\varphi$ be represented in an
orthonormal basis $e_1, e_2, \ldots, e_n$ by the orthogonal matrix A;
then (8) holds. Since the basis e is orthonormal, the scalar product
of any vectors (in particular, any vectors of the system $e_1\varphi,
e_2\varphi, \ldots, e_n\varphi$) is equal to the sum of the products of the
corresponding coordinates of these vectors in the basis e. Therefore,
since matrix A is orthogonal,

$$(e_i\varphi, e_i\varphi) = 1, \quad i = 1, 2, \ldots, n,$$
$$(e_i\varphi, e_j\varphi) = 0 \quad \text{for } i \ne j$$

That is to say, the system $e\varphi$ is itself an orthonormal basis for the
space $E_n$. Whence follows the orthogonality of the transformation
$\varphi$.

As the reader will recall from analytic geometry, of all the affine
transformations of a plane that leave the coordinate origin fixed,
rotations (combined perhaps with mirror reflections) are the only
ones that preserve the scalar product of the vectors. Thus, orthogonal
transformations of n-dimensional Euclidean space may be regarded
as "rotations" of this space.
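
For the plane, this remark can be illustrated by the following Python sketch, which writes down the matrix of a rotation and of a mirror reflection and checks that the rows of each form an orthonormal system. The function names `rotation`, `gram` and `det2`, the chosen angle, and the sign convention of the rotation matrix are our assumptions.

```python
import math

def rotation(theta):
    """Matrix of the rotation of the plane through the angle theta."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]

reflection = [[1.0, 0.0], [0.0, -1.0]]   # mirror reflection in the first axis

def gram(m):
    """Matrix of scalar products of the rows of m (the unit matrix
    when m is orthogonal)."""
    return [[sum(x * y for x, y in zip(r, s)) for s in m] for r in m]

def det2(m):
    """Determinant of a second-order matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

r = rotation(math.pi / 6)
```

The rotation has determinant +1 and the reflection has determinant -1, in agreement with the values ±1 possible for an orthogonal matrix.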

Obviously, one of the orthogonal transformations of Euclidean
space is the identity transformation. On the other hand, the
relationship we have established between orthogonal transformations
and orthogonal matrices, and also the relationship (presented in
Sec. 31) between operations on linear transformations and on matrices,
permit deriving, from familiar properties of orthogonal matrices,
the following properties of orthogonal transformations of Euclidean
space, which can be verified directly.

Every orthogonal transformation is nonsingular and its inverse
is also orthogonal.

The product of any orthogonal transformations is orthogonal.

36. Symmetric Transformations

A linear transformation $\varphi$ of n-dimensional Euclidean space is
called symmetric (or self-adjoint) if for any vectors a, b of this space
we have the equality

$$(a\varphi, b) = (a, b\varphi) \tag{1}$$

That is, in scalar multiplication the symbol of a symmetric
transformation may be carried from one factor to the other.

Obvious instances of symmetric transformations are the identity
transformation $\varepsilon$ and the zero transformation $\omega$. A more
general example is the linear transformation in which each vector is
multiplied by a fixed scalar $\alpha$,

$$a\varphi = \alpha a$$

Indeed, in this case

$$(a\varphi, b) = (\alpha a, b) = \alpha (a, b) = (a, \alpha b) = (a, b\varphi)$$

The role of symmetric transformations is extremely great and 
calls for a detailed study. 

A symmetric transformation of a Euclidean space in any orthonormal
basis is represented by a symmetric matrix. Conversely, if a linear
transformation of a Euclidean space is represented in at least one
orthonormal basis by a symmetric matrix, then the transformation is
symmetric.

Indeed, let the symmetric transformation $\varphi$ be represented in
an orthonormal basis $e_1, e_2, \ldots, e_n$ by the matrix
$A = (a_{ij})$. Taking into account that in an orthonormal basis the
scalar product of two vectors is equal to the sum of the products of
the corresponding coordinates of these vectors, we obtain

$$(e_i\varphi, e_j) = \Bigl(\sum_{k=1}^{n} a_{ik} e_k, e_j\Bigr) = a_{ij},$$

$$(e_i, e_j\varphi) = \Bigl(e_i, \sum_{k=1}^{n} a_{jk} e_k\Bigr) = a_{ji}$$

That is, due to (1),

$$a_{ij} = a_{ji}$$

for all i and j. The matrix A is thus symmetric.

Conversely, let a linear transformation $\varphi$ be represented in the
orthonormal basis $e_1, e_2, \ldots, e_n$ by the symmetric matrix
$A = (a_{ij})$,

$$a_{ij} = a_{ji} \quad \text{for all } i \text{ and } j \tag{2}$$

If

$$b = \sum_{i=1}^{n} \beta_i e_i, \qquad c = \sum_{j=1}^{n} \gamma_j e_j$$

are any vectors of the space, then

$$b\varphi = \sum_{i=1}^{n} \beta_i (e_i\varphi) = \sum_{j=1}^{n} \Bigl(\sum_{i=1}^{n} \beta_i a_{ij}\Bigr) e_j,$$

$$c\varphi = \sum_{j=1}^{n} \gamma_j (e_j\varphi) = \sum_{i=1}^{n} \Bigl(\sum_{j=1}^{n} \gamma_j a_{ji}\Bigr) e_i$$

Using the fact that the e-basis is orthonormal, we get

$$(b\varphi, c) = \sum_{i,j=1}^{n} \beta_i a_{ij} \gamma_j,$$

$$(b, c\varphi) = \sum_{i,j=1}^{n} \beta_i \gamma_j a_{ji}$$

By (2), the right sides of the latter equalities coincide, and therefore

$$(b\varphi, c) = (b, c\varphi)$$

which completes the proof.
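
The equality just proved is easy to test on an example. In the Python sketch below, vectors are coordinate rows in an orthonormal basis and a transformation is applied as "row times matrix"; the helper names `dot` and `apply` and both sample matrices are our own assumptions. The second matrix is deliberately not symmetric, to show that the equality may then fail.

```python
def dot(a, b):
    """Scalar product as the sum of products of coordinates."""
    return sum(x * y for x, y in zip(a, b))

def apply(a, m):
    """Coordinate row of the image of a: the row a times the matrix m."""
    return [sum(a[i] * m[i][j] for i in range(len(a))) for j in range(len(m[0]))]

symmetric = [[2, 1, 0], [1, 3, -1], [0, -1, 4]]
not_symmetric = [[2, 1, 0], [0, 3, -1], [0, -1, 4]]
b, c = [1, 2, 3], [4, 0, -1]

lhs = dot(apply(b, symmetric), c)   # (b phi, c)
rhs = dot(b, apply(c, symmetric))   # (b, c phi)

lhs_bad = dot(apply(b, not_symmetric), c)
rhs_bad = dot(b, apply(c, not_symmetric))
```

For the symmetric matrix both sides are equal; for the second matrix they differ.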

The result obtained yields the following property of symmetric
transformations that can readily be verified directly.

The sum of symmetric transformations and also the product of a
symmetric transformation by a scalar are again symmetric transformations.

We now prove the following important theorem.

All characteristic roots of a symmetric transformation are real.

Since the characteristic roots of any linear transformation
coincide with the characteristic roots of the matrix of this
transformation in any basis, and a symmetric transformation is represented
in orthonormal bases by symmetric matrices, it suffices to prove
the following assertion.

All the characteristic roots of a symmetric matrix are real.

Let $\lambda_0$ be a characteristic root (possibly complex) of the
symmetric matrix $A = (a_{ij})$,

$$|A - \lambda_0 E| = 0$$

Then the system of homogeneous linear equations with complex
coefficients

$$\sum_{j=1}^{n} a_{ij} x_j = \lambda_0 x_i, \qquad i = 1, 2, \ldots, n$$

has a zero determinant, which is to say, it has a nontrivial solution
$\beta_1, \beta_2, \ldots, \beta_n$ (generally complex). Thus,

$$\sum_{j=1}^{n} a_{ij} \beta_j = \lambda_0 \beta_i, \qquad i = 1, 2, \ldots, n \tag{3}$$

Multiplying both sides of each ith equation of (3) by the scalar
$\bar{\beta}_i$, the conjugate of $\beta_i$, and adding separately the left
and right members of all the resulting equations, we get the equation

$$\sum_{i,j=1}^{n} a_{ij} \beta_j \bar{\beta}_i = \lambda_0 \sum_{i=1}^{n} \beta_i \bar{\beta}_i \tag{4}$$

The coefficient of $\lambda_0$ in (4) is a nonzero real number since it is
the sum of nonnegative real numbers, of which at least one is strictly
positive. The real nature of the number $\lambda_0$ will therefore be proved
if we prove the real nature of the left-hand side of (4); to do this,
it suffices to show that this complex number coincides with its
conjugate. Here, for the first time, we make use of the symmetric nature
of the (real) matrix A:

$$\overline{\sum_{i,j=1}^{n} a_{ij} \beta_j \bar{\beta}_i} = \sum_{i,j=1}^{n} \overline{a_{ij} \beta_j \bar{\beta}_i} = \sum_{i,j=1}^{n} a_{ij} \bar{\beta}_j \beta_i = \sum_{i,j=1}^{n} a_{ji} \bar{\beta}_j \beta_i = \sum_{i,j=1}^{n} a_{ij} \bar{\beta}_i \beta_j = \sum_{i,j=1}^{n} a_{ij} \beta_j \bar{\beta}_i$$

Note that the second last equality is obtained by a simple interchange
in the summation indices: j is put in place of i, i in place of j. Hence,
the theorem is proved.
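
For matrices of the second order the theorem can be seen directly; the following short computation is offered only as an illustration, with a, b, c denoting arbitrary real numbers (our notation, not that of the proof above). The discriminant of the characteristic polynomial is a sum of squares and so cannot be negative:

```latex
\[
A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \qquad
|A - \lambda E| = \lambda^2 - (a + c)\lambda + (ac - b^2),
\]
\[
D = (a + c)^2 - 4(ac - b^2) = (a - c)^2 + 4b^2 \ge 0 .
\]
```

Hence both characteristic roots of A are real, and they coincide only when a = c and b = 0.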

A linear transformation $\varphi$ of the Euclidean space $E_n$ is symmetric
if and only if there exists in $E_n$ an orthonormal basis composed of the
eigenvectors of the transformation.

In one direction, this assertion is almost obvious: if there exists
in $E_n$ an orthonormal basis $e_1, e_2, \ldots, e_n$ such that

$$e_i\varphi = \lambda_i e_i, \qquad i = 1, 2, \ldots, n$$

then in the e-basis the transformation $\varphi$ is represented by the
diagonal matrix

$$\begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{pmatrix}$$

A diagonal matrix, however, is symmetric, and so the transformation
$\varphi$ is represented in the orthonormal basis e by a symmetric
matrix, hence it is symmetric.

The basic converse assertion of the theorem we prove by induction
with respect to the dimension n of the space $E_n$. Indeed, for n = 1,
any linear transformation $\varphi$ of $E_1$ invariably carries any vector
into a proportional vector, whence it follows that any nonzero vector a
is an eigenvector for $\varphi$ (incidentally, it also follows that any
linear transformation of the space $E_1$ is symmetric). Normalizing the
vector a, we obtain the desired orthonormal basis of the space $E_1$.

Let the assertion of the theorem be proved for an (n - 1)-dimensional
Euclidean space and let a symmetric transformation $\varphi$
be given in the space $E_n$. From the above-proved theorem follows
the existence, for $\varphi$, of a real characteristic root $\lambda_0$.
Consequently, this number is an eigenvalue of the transformation
$\varphi$. If a is an eigenvector of the transformation $\varphi$
corresponding to this eigenvalue, then any nonzero vector proportional
to the vector a will (under $\varphi$) be an eigenvector corresponding to
the same eigenvalue $\lambda_0$, since

$$(\alpha a)\varphi = \alpha (a\varphi) = \alpha (\lambda_0 a) = \lambda_0 (\alpha a)$$

In particular, normalizing the vector a, we obtain a vector $e_1$ such
that

$$e_1\varphi = \lambda_0 e_1, \qquad (e_1, e_1) = 1$$

As was proved in Sec. 34, the nonzero vector $e_1$ may be included
in the orthogonal basis

$$e_1, e_2', \ldots, e_n' \tag{5}$$

of the space $E_n$. Those vectors whose first coordinate in the basis (5)
is zero, that is, vectors of the form $\alpha_2 e_2' + \ldots + \alpha_n e_n'$,
obviously constitute an (n - 1)-dimensional linear subspace of the
space $E_n$, which we will designate by L. It will even be an
(n - 1)-dimensional Euclidean space, since a scalar product, being
defined for all vectors in $E_n$, is in particular defined for vectors in
L and possesses all the requisite properties.

The subspace L consists of all the vectors of $E_n$ which are
orthogonal to the vector $e_1$. Indeed, if

$$a = \alpha_1 e_1 + \alpha_2 e_2' + \ldots + \alpha_n e_n'$$

then, by the orthogonality of the basis (5) and the normalized
character of the vector $e_1$,

$$(e_1, a) = \alpha_1 (e_1, e_1) + \alpha_2 (e_1, e_2') + \ldots + \alpha_n (e_1, e_n') = \alpha_1$$

that is to say, $(e_1, a) = 0$ if and only if $\alpha_1 = 0$.

If the vector a belongs to the subspace L, i.e., $(e_1, a) = 0$, then
the vector $a\varphi$ too lies in L. Indeed, because of the symmetry of the
transformation $\varphi$,

$$(e_1, a\varphi) = (e_1\varphi, a) = (\lambda_0 e_1, a) = \lambda_0 (e_1, a) = \lambda_0 \cdot 0 = 0$$

That is, the vector $a\varphi$ is orthogonal to $e_1$ and therefore lies in L.
This property of the subspace L, which is called its invariance under
the transformation $\varphi$, enables us to consider $\varphi$ (regarded
solely with respect to the vectors in L) as a linear transformation of
this (n - 1)-dimensional Euclidean space. It will even be a symmetric
transformation of the space L, since equation (1), which holds for any
vectors in $E_n$, will hold (as a particular case) for vectors lying in L.

By virtue of the induction hypothesis, space L has an orthonormal
basis consisting of the eigenvectors of the transformation $\varphi$;
denote it by $e_2, \ldots, e_n$. All these vectors are orthogonal to the
vector $e_1$, and so $e_1, e_2, \ldots, e_n$ is the desired orthonormal
basis of the space $E_n$ consisting of the eigenvectors of the
transformation $\varphi$. The theorem is proved.

37. Reducing a Quadratic Form to Principal Axes. 

Pairs of Forms 

Let us apply the last theorem of the preceding section to prove 
the following matrix theorem. 

For every symmetric matrix A it is possible to find an orthogonal
matrix Q which diagonalizes matrix A, that is, the matrix $Q^{-1}AQ$
obtained by transforming matrix A by matrix Q will be diagonal.

Let there be given a symmetric matrix A of order n. If
$e_1, e_2, \ldots, e_n$ is some orthonormal basis of an n-dimensional
Euclidean space $E_n$, then matrix A represents in this basis a symmetric
transformation $\varphi$. As has been proved, there is in $E_n$ an
orthonormal basis $f_1, f_2, \ldots, f_n$ made up of the eigenvectors of
the transformation $\varphi$; in this basis, $\varphi$ is represented by the
diagonal matrix B (see Sec. 33). Then, by Sec. 31,

$$B = Q^{-1}AQ \tag{1}$$

where Q is the change-of-basis matrix from the f-basis to the e-basis,

$$e = Qf \tag{2}$$

This matrix, as a matrix for changing from one orthonormal basis
to another such basis, will be orthogonal (see Sec. 35). The theorem
is proved.
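
The theorem can be illustrated on a matrix of the second order. The Python sketch below uses a sample symmetric matrix whose characteristic roots (3 and 1) and eigenvectors are easily found by hand; the matrix, the helper names `transpose` and `matmul`, and the choice of Q are our assumptions, and Q'AQ is formed in place of $Q^{-1}AQ$ because Q is orthogonal.

```python
import math

def transpose(m):
    """Transposed matrix m'."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

a = [[2.0, 1.0], [1.0, 2.0]]   # sample symmetric matrix with roots 3 and 1
s = 1.0 / math.sqrt(2.0)
q = [[s, s], [s, -s]]          # built from the normalized eigenvectors (1, 1) and (1, -1)

identity_check = matmul(transpose(q), q)      # should be the unit matrix
b = matmul(matmul(transpose(q), a), q)        # Q'AQ, which equals Q^{-1}AQ here
```

Up to rounding, b is the diagonal matrix with the characteristic roots 3 and 1 on the principal diagonal.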

Since the inverse of orthogonal matrix Q is equal to its
transpose, $Q^{-1} = Q'$, equation (1) may be rewritten as

$$B = Q'AQ$$

From Sec. 26, however, we know that such precisely is the
transformation of the symmetric matrix A of a quadratic form subject
to a linear transformation of the unknowns with the matrix Q.
However, taking into account that a linear transformation of
unknowns with an orthogonal matrix is an orthogonal transformation
(see Sec. 35) and that a quadratic form reduced to canonical form
has a diagonal matrix, we arrive, on the basis of the preceding
theorem, at the following theorem on the reduction of a real quadratic
form to principal axes.

Every real quadratic form $f(x_1, x_2, \ldots, x_n)$ can be reduced to
canonical form by an orthogonal transformation of the unknowns.

Although there may be many different orthogonal transformations
of the unknowns which reduce the given quadratic form to
canonical form, the canonical form itself is actually determined
uniquely.

No matter what the orthogonal transformation that reduces to
canonical form the quadratic form $f(x_1, x_2, \ldots, x_n)$ with matrix A,
the coefficients of this canonical form are the characteristic roots of
the matrix A (counting multiplicities).

Suppose an orthogonal transformation reduces form f to the
canonical form

$$f(x_1, x_2, \ldots, x_n) = \mu_1 y_1^2 + \mu_2 y_2^2 + \ldots + \mu_n y_n^2$$

This orthogonal transformation preserves the sum of the squares
of the unknowns and so, if $\lambda$ is a new unknown,

$$f(x_1, x_2, \ldots, x_n) - \lambda \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} \mu_i y_i^2 - \lambda \sum_{i=1}^{n} y_i^2$$

Taking determinants of these quadratic forms and taking into
account that after completing the linear transformation the determinant
of the quadratic form is multiplied by the square of the determinant
of the transformation (see Sec. 28), and the square of the
determinant of an orthogonal transformation is equal to unity (see
Sec. 35), we get the equation

$$|A - \lambda E| = \begin{vmatrix} \mu_1 - \lambda & & & 0 \\ & \mu_2 - \lambda & & \\ & & \ddots & \\ 0 & & & \mu_n - \lambda \end{vmatrix} = \prod_{i=1}^{n} (\mu_i - \lambda)$$

from which follows the assertion of the theorem.

This result may be stated in matrix form as well.

No matter what the orthogonal matrix which diagonalizes the
symmetric matrix A, the principal diagonal of the resulting diagonal
matrix will exhibit the characteristic roots of the matrix A taken with
their multiplicities.

Finding the orthogonal transformation that reduces a quadratic
form to principal axes. In certain problems it is necessary to know
not only the canonical form to which a real quadratic form is
reduced by an orthogonal transformation, but also the orthogonal
transformation itself which accomplishes the reduction. It would be
rather difficult to find this transformation by using the
principal-axis theorem, so we shall point out a different way. Namely,
all we need to know is how to find the orthogonal matrix $Q$ which
diagonalizes the given symmetric matrix $A$, or, what is the same
thing, to find its inverse matrix $Q^{-1}$. By (2), this is the
change-of-basis matrix from the $e$-basis to the $f$-basis; that is,
its rows are coordinate rows (in the $e$-basis) of an orthonormal
system of $n$ eigenvectors of the symmetric transformation $\varphi$
defined by the matrix $A$ in the $e$-basis. It remains to find such a
system of eigenvectors.

Let $\lambda_0$ be any characteristic root of the matrix $A$ and let its
multiplicity be equal to $k_0$. From Sec. 33 we know that the collection
of coordinate rows of all eigenvectors of the transformation $\varphi$
corresponding to the eigenvalue $\lambda_0$ coincides with the set of
nonzero solutions of the system of homogeneous linear equations

$$(A - \lambda_0 E) X = 0 \qquad (3)$$

Here, the symmetric nature of the matrix $A$ enables us to write $A$
in place of $A'$. From the above-proved theorems on the existence of
an orthogonal matrix that diagonalizes the symmetric matrix $A$,
and on the uniqueness of this diagonal form, it follows that for
system (3) it is certainly possible to find $k_0$ linearly independent
solutions. We seek such a system of solutions by the methods taken
from Sec. 12, and then we orthogonalize and normalize the resulting
system in accord with Sec. 34.

Taking in turn, for $\lambda_0$, all the different characteristic roots of
the symmetric matrix $A$ and noting that the sum of the
multiplicities of these roots is equal to $n$, we obtain a set of $n$
eigenvectors of the transformation $\varphi$ represented by their
coordinates in the $e$-basis. To prove that this is the desired
orthonormal system of eigenvectors, it remains to prove the following
lemma.

The eigenvectors of the symmetric transformation $\varphi$ which
correspond to distinct eigenvalues are mutually orthogonal.

Suppose that

$$b\varphi = \lambda_1 b, \qquad c\varphi = \lambda_2 c$$

and $\lambda_1 \neq \lambda_2$. Since

$$(b\varphi, c) = (\lambda_1 b, c) = \lambda_1 (b, c),$$
$$(b, c\varphi) = (b, \lambda_2 c) = \lambda_2 (b, c)$$

it follows from

$$(b\varphi, c) = (b, c\varphi)$$

that

$$\lambda_1 (b, c) = \lambda_2 (b, c)$$

or, because $\lambda_1 \neq \lambda_2$,

$$(b, c) = 0$$

which is what we set out to prove.

Example. Reduce to principal axes the quadratic form

$$f(x_1, x_2, x_3, x_4) = 2x_1x_2 + 2x_1x_3 - 2x_1x_4 - 2x_2x_3 + 2x_2x_4 + 2x_3x_4$$

The matrix $A$ of this form looks like

$$A = \begin{pmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & -1 & 1 \\ 1 & -1 & 0 & 1 \\ -1 & 1 & 1 & 0 \end{pmatrix}$$

Let us find its characteristic polynomial:

$$|A - \lambda E| = \begin{vmatrix} -\lambda & 1 & 1 & -1 \\ 1 & -\lambda & -1 & 1 \\ 1 & -1 & -\lambda & 1 \\ -1 & 1 & 1 & -\lambda \end{vmatrix} = (\lambda - 1)^3 (\lambda + 3)$$

Thus, the matrix $A$ has a triple characteristic root 1 and a simple
characteristic root $-3$. Hence, we can already write the canonical
form to which the form $f$ is reduced by an orthogonal transformation:

$$f = y_1^2 + y_2^2 + y_3^2 - 3y_4^2$$

Let us find the orthogonal transformation that accomplishes this reduction.
The system of homogeneous linear equations (3) becomes, for $\lambda_0 = 1$,

$$\begin{aligned} -x_1 + x_2 + x_3 - x_4 &= 0, \\ x_1 - x_2 - x_3 + x_4 &= 0, \\ x_1 - x_2 - x_3 + x_4 &= 0, \\ -x_1 + x_2 + x_3 - x_4 &= 0 \end{aligned}$$

The rank of this system is unity and so we can find three linearly independent
solutions for it. For example, the vectors

$$b_1 = (1, 1, 0, 0), \quad b_2 = (1, 0, 1, 0), \quad b_3 = (-1, 0, 0, 1)$$

will be such solutions.

Orthogonalizing this system of vectors, we obtain the following system of
vectors:

$$c_1 = b_1 = (1, 1, 0, 0),$$
$$c_2 = -\tfrac{1}{2} c_1 + b_2 = \left(\tfrac{1}{2}, -\tfrac{1}{2}, 1, 0\right),$$
$$c_3 = \tfrac{1}{2} c_1 + \tfrac{1}{3} c_2 + b_3 = \left(-\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}, 1\right)$$



On the other hand, the system of homogeneous linear equations (3) becomes,
for $\lambda_0 = -3$,

$$\begin{aligned} 3x_1 + x_2 + x_3 - x_4 &= 0, \\ x_1 + 3x_2 - x_3 + x_4 &= 0, \\ x_1 - x_2 + 3x_3 + x_4 &= 0, \\ -x_1 + x_2 + x_3 + 3x_4 &= 0 \end{aligned}$$

This system has rank 3. Its nontrivial solution is the vector

$$c_4 = (1, -1, -1, 1)$$

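The system of vectors $c_1, c_2, c_3, c_4$ is orthogonal. Normalizing it, we
arrive at the orthonormal system of vectors

$$c_1' = \left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, 0, 0\right),$$
$$c_2' = \left(\tfrac{1}{\sqrt{6}}, -\tfrac{1}{\sqrt{6}}, \sqrt{\tfrac{2}{3}}, 0\right),$$
$$c_3' = \left(-\tfrac{1}{2\sqrt{3}}, \tfrac{1}{2\sqrt{3}}, \tfrac{1}{2\sqrt{3}}, \tfrac{\sqrt{3}}{2}\right),$$
$$c_4' = \left(\tfrac{1}{2}, -\tfrac{1}{2}, -\tfrac{1}{2}, \tfrac{1}{2}\right)$$

Thus, the form $f$ is reduced to principal axes by the orthogonal transformation

$$\begin{aligned} y_1 &= \tfrac{1}{\sqrt{2}} x_1 + \tfrac{1}{\sqrt{2}} x_2, \\ y_2 &= \tfrac{1}{\sqrt{6}} x_1 - \tfrac{1}{\sqrt{6}} x_2 + \sqrt{\tfrac{2}{3}}\, x_3, \\ y_3 &= -\tfrac{1}{2\sqrt{3}} x_1 + \tfrac{1}{2\sqrt{3}} x_2 + \tfrac{1}{2\sqrt{3}} x_3 + \tfrac{\sqrt{3}}{2} x_4, \\ y_4 &= \tfrac{1}{2} x_1 - \tfrac{1}{2} x_2 - \tfrac{1}{2} x_3 + \tfrac{1}{2} x_4 \end{aligned}$$

It is well to note that the choice of a system of linearly independent
eigenvectors corresponding to a multiple eigenvalue is far from unique, and
so there are many different orthogonal transformations which reduce the form $f$
to canonical form. We found only one of them.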
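The reduction in this example can be checked numerically. The following
is only a sketch in plain Python (the helper names `matmul` and
`transpose` are ours, not the book's): it takes the matrix $A$ of the form
and the matrix $Q$ whose rows are the normalized eigenvectors found above,
and verifies that $Q$ is orthogonal and that $QAQ'$ is the diagonal matrix
with diagonal 1, 1, 1, $-3$.

```python
from math import sqrt, isclose

# Matrix of the quadratic form f from the example.
A = [
    [0, 1, 1, -1],
    [1, 0, -1, 1],
    [1, -1, 0, 1],
    [-1, 1, 1, 0],
]

# Rows of Q are the normalized eigenvectors c1', c2', c3', c4'.
s2, s3, s6 = sqrt(2), sqrt(3), sqrt(6)
Q = [
    [1 / s2, 1 / s2, 0, 0],
    [1 / s6, -1 / s6, 2 / s6, 0],
    [-1 / (2 * s3), 1 / (2 * s3), 1 / (2 * s3), s3 / 2],
    [0.5, -0.5, -0.5, 0.5],
]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def transpose(X):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]


QQt = matmul(Q, transpose(Q))             # should be the identity matrix
D = matmul(matmul(Q, A), transpose(Q))    # should be diag(1, 1, 1, -3)

for row in D:
    print([round(v, 10) for v in row])
```

Working with floating-point numbers means the off-diagonal entries are
zero only up to rounding error, hence the rounding before printing.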

Pairs of forms. Let there be a pair of real quadratic forms in $n$
unknowns, $f(x_1, x_2, \ldots, x_n)$ and $g(x_1, x_2, \ldots, x_n)$. Does there
exist a nonsingular linear transformation of the unknowns $x_1,
x_2, \ldots, x_n$ that will simultaneously reduce both forms to
canonical form?

In the general case, the answer is no. Let us examine the pair
of forms

$$f(x_1, x_2) = x_1^2, \qquad g(x_1, x_2) = x_1 x_2$$

Let there be a nonsingular linear transformation

$$\begin{aligned} x_1 &= c_{11} y_1 + c_{12} y_2, \\ x_2 &= c_{21} y_1 + c_{22} y_2 \end{aligned} \qquad (4)$$

which reduces both forms to canonical form. For $f$ to be reduced by
transformation (4) to canonical form, one of the coefficients $c_{11}$,
$c_{12}$ must be zero, otherwise the term $2c_{11}c_{12}y_1y_2$ would occur.
Renumbering, if necessary, the unknowns $y_1, y_2$, we can set $c_{12} = 0$ and
$c_{11} \neq 0$. However, we now find that

$$g(x_1, x_2) = c_{11} y_1 (c_{21} y_1 + c_{22} y_2) = c_{11} c_{21} y_1^2 + c_{11} c_{22} y_1 y_2$$

Since the form $g$ was also to become canonical, it follows that
$c_{11} c_{22} = 0$, that is, $c_{22} = 0$, which, together with $c_{12} = 0$,
contradicts the nonsingularity of the linear transformation (4).

The situation is different if we assume that at least one of our
forms, say $g(x_1, x_2, \ldots, x_n)$, is positive definite*. Namely, the
following theorem holds.

If $f$ and $g$ form a pair of real quadratic forms in $n$ unknowns, and the
second one is positive definite, then there exists a nonsingular linear
transformation which simultaneously reduces $g$ to normal form and $f$
to canonical form.

For proof, first perform the nonsingular linear transformation
of the unknowns $x_1, x_2, \ldots, x_n$,

$$X = TY$$

which reduces the positive definite form $g$ to normal form,

$$g(x_1, x_2, \ldots, x_n) = y_1^2 + y_2^2 + \ldots + y_n^2$$

Then $f$ will go into some form $\varphi$ in the new unknowns,

$$f(x_1, x_2, \ldots, x_n) = \varphi(y_1, y_2, \ldots, y_n)$$

Now perform an orthogonal transformation of the unknowns
$y_1, y_2, \ldots, y_n$,

$$Y = QZ$$

which reduces $\varphi$ to principal axes,

$$\varphi(y_1, y_2, \ldots, y_n) = \lambda_1 z_1^2 + \lambda_2 z_2^2 + \ldots + \lambda_n z_n^2$$

This transformation (see definition in Sec. 35) carries the sum of the
squares of the unknowns $y_1, y_2, \ldots, y_n$ into the sum of the squares
of the unknowns $z_1, z_2, \ldots, z_n$. As a result we get

$$f(x_1, x_2, \ldots, x_n) = \lambda_1 z_1^2 + \lambda_2 z_2^2 + \ldots + \lambda_n z_n^2,$$
$$g(x_1, x_2, \ldots, x_n) = z_1^2 + z_2^2 + \ldots + z_n^2$$

That is, the linear transformation

$$X = (TQ) Z$$

is the required one.


* This condition is not of course necessary; thus, both the forms
$x_1^2 + x_2^2 - x_3^2$ and $x_1^2 - x_2^2 - x_3^2$ now have canonical form,
though none is positive definite.



CHAPTER 9 


EVALUATING ROOTS 
OF POLYNOMIALS 


38. Equations of Second, Third, and Fourth Degree

The fundamental theorem proved in Sec. 23 establishes the
existence of $n$ complex roots for any polynomial of degree $n$ with
numerical coefficients. The proofs (both ours and any other existing
proofs) do not however indicate any methods for finding these roots.
They are thus pure "existence proofs". The search for such methods
began naturally in attempts to derive formulas similar to the one used
in the solution of quadratic equations for the case of real
coefficients so familiar from school algebra. We will now show that
this formula holds true for quadratic equations with complex
coefficients as well, and that analogous formulas (though much more
involved) can be derived for equations of the third and fourth degree.

Quadratic equations. Suppose we have a quadratic equation

$$x^2 + px + q = 0$$

with arbitrary complex coefficients; the leading coefficient may,
without loss of generality, be considered equal to unity. This equation
may be written as

$$\left(x + \frac{p}{2}\right)^2 - \left(\frac{p^2}{4} - q\right) = 0$$

As we know, it is possible to take the square root of the complex
number $\frac{p^2}{4} - q$ without going outside the complex-number system.
The two values of this root, which differ in sign alone, can be written
as $\pm\sqrt{\frac{p^2}{4} - q}$. Therefore,

$$x + \frac{p}{2} = \pm\sqrt{\frac{p^2}{4} - q}$$

That is, the roots of the given equation may be found via the usual
formula

$$x = -\frac{p}{2} \pm \sqrt{\frac{p^2}{4} - q}$$


Example. Solve

$$x^2 - 3x + (3 - i) = 0$$

Using the formula derived above, we get

$$x = \frac{3}{2} \pm \sqrt{\frac{9}{4} - (3 - i)} = \frac{3}{2} \pm \frac{1}{2}\sqrt{-3 + 4i}$$

Applying the methods of Sec. 19, we find

$$\sqrt{-3 + 4i} = \pm(1 + 2i)$$

and therefore

$$x_1 = 2 + i, \qquad x_2 = 1 - i$$
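The same computation can be reproduced with Python's standard `cmath`
module. This is a minimal sketch; the function name
`solve_monic_quadratic` is our own choice, and `cmath.sqrt` returns one of
the two values of the square root, the other being its negative.

```python
import cmath


def solve_monic_quadratic(p, q):
    """Roots of x^2 + p*x + q = 0 for complex p and q.

    Uses x = -p/2 +/- sqrt(p^2/4 - q), the formula derived in the text.
    """
    root = cmath.sqrt(p * p / 4 - q)
    return (-p / 2 + root, -p / 2 - root)


# The example from the text: x^2 - 3x + (3 - i) = 0.
x1, x2 = solve_monic_quadratic(-3, 3 - 1j)
print(x1, x2)
```

By Vieta's formulas the two values returned must add up to $-p$ and
multiply to $q$, which gives a quick independent check of the result.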

Cubic equations. Unlike the case of quadratic equations, we have
not had a procedure for solving cubic equations even in the case of
real coefficients. We will now derive a formula for cubic equations
similar to the formula used for quadratic equations, and we
assume from the start that the coefficients can be any complex
numbers.

Suppose we have the cubic equation

$$y^3 + ay^2 + by + c = 0 \qquad (1)$$

with arbitrary complex coefficients. Replacing in (1) the unknown
$y$ by a new unknown $x$ related to $y$ by the equation

$$y = x - \frac{a}{3} \qquad (2)$$

we get an equation in the unknown $x$ which, as can readily be
verified, does not contain the square of the unknown; that is, we have
an equation of the form

$$x^3 + px + q = 0 \qquad (3)$$

If the roots of (3) are found, then, by (2), we will get the roots of the
given equation (1) as well. Our job, therefore, is to learn to solve
the "incomplete" cubic equation (3) with arbitrary complex
coefficients.

By the fundamental theorem, equation (3) has three complex
roots. Let $x_0$ be one of them. We introduce an auxiliary unknown $u$
and consider the polynomial

$$f(u) = u^2 - x_0 u - \frac{p}{3}$$

Its coefficients are complex numbers and therefore it has two complex
roots $\alpha$ and $\beta$; by Vieta's formulas,

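$$\alpha + \beta = x_0, \qquad (4)$$

$$\alpha\beta = -\frac{p}{3} \qquad (5)$$

Substituting expression (4) of the root $x_0$ into (3), we get

$$(\alpha + \beta)^3 + p(\alpha + \beta) + q = 0$$

or

$$\alpha^3 + \beta^3 + (3\alpha\beta + p)(\alpha + \beta) + q = 0$$

However, from (5) it follows that $3\alpha\beta + p = 0$, and so we have

$$\alpha^3 + \beta^3 = -q \qquad (6)$$

On the other hand, from (5) it follows that

$$\alpha^3\beta^3 = -\frac{p^3}{27} \qquad (7)$$

Equations (6) and (7) show that the numbers $\alpha^3$ and $\beta^3$ are roots
of the quadratic equation

$$z^2 + qz - \frac{p^3}{27} = 0 \qquad (8)$$

with complex coefficients.
Solving (8), we get

$$z = -\frac{q}{2} \pm \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}$$

whence*

$$\alpha = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}}, \qquad \beta = \sqrt[3]{-\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}} \qquad (9)$$

We arrive at the following formula (Cardan's formula) which
expresses the roots of equation (3) in terms of its coefficients by means
of radicals of index 2 and index 3:

$$x_0 = \alpha + \beta = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}}$$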



Since a cube root has three values in the field of complex
numbers, formulas (9) yield three values for $\alpha$ and three for $\beta$.
However, when using Cardan's formula, one cannot combine just any value
of the root $\alpha$ with any value of the root $\beta$; for a given value of
$\alpha$ we have to take only that one of the three values of $\beta$ which
satisfies condition (5).

Let $\alpha_1$ be any one of the three values of the root $\alpha$. Then the two
others may be obtained, as was proved in Sec. 19, by multiplying $\alpha_1$
by the cube roots $\varepsilon$ and $\varepsilon^2$ of unity:

$$\alpha_2 = \alpha_1\varepsilon, \qquad \alpha_3 = \alpha_1\varepsilon^2$$

Denote by $\beta_1$ that one of the three values of the root $\beta$ which
corresponds to the value $\alpha_1$ of the root $\alpha$ on the basis of (5),
that is, $\alpha_1\beta_1 = -\frac{p}{3}$. The two other values of $\beta$ are

$$\beta_2 = \beta_1\varepsilon, \qquad \beta_3 = \beta_1\varepsilon^2$$

Since, by $\varepsilon^3 = 1$,

$$\alpha_2\beta_3 = \alpha_1\varepsilon \cdot \beta_1\varepsilon^2 = \alpha_1\beta_1\varepsilon^3 = \alpha_1\beta_1 = -\frac{p}{3}$$

it follows that the value $\alpha_2$ of root $\alpha$ is associated with the value
$\beta_3$ of root $\beta$; similarly, to the value $\alpha_3$ there corresponds the
value $\beta_2$. Thus, all three roots of equation (3) can be written as follows:

$$\begin{aligned} x_1 &= \alpha_1 + \beta_1, \\ x_2 &= \alpha_2 + \beta_3 = \alpha_1\varepsilon + \beta_1\varepsilon^2, \\ x_3 &= \alpha_3 + \beta_2 = \alpha_1\varepsilon^2 + \beta_1\varepsilon \end{aligned} \qquad (10)$$


* It is immaterial which of the roots of (8) we take for $\alpha^3$ and which one
for $\beta^3$, since $\alpha$ and $\beta$ enter in symmetrical fashion into (6) and (7)
and also into the expression (4) for $x_0$.
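Cardan's formula together with the pairing rule (5) and formulas (10)
can be sketched in Python as follows. The function name `cardano` is our
own, and two details are implementation choices rather than part of the
text: of the two roots of (8) the one of larger modulus is taken for
$\alpha^3$ (so that $\alpha_1 \neq 0$ unless $p = q = 0$), and $\beta_1$ is
computed as $-p/(3\alpha_1)$ rather than as an independent cube root,
which enforces condition (5) automatically.

```python
import cmath

# A primitive cube root of unity, epsilon = -1/2 + i*sqrt(3)/2.
EPS = complex(-0.5, 3 ** 0.5 / 2)


def cardano(p, q):
    """Three roots of x^3 + p*x + q = 0 for complex p and q."""
    s = cmath.sqrt(q * q / 4 + p ** 3 / 27)
    # alpha^3 is a root of z^2 + q*z - p^3/27 = 0; take the larger one.
    z = -q / 2 + s
    if abs(-q / 2 - s) > abs(z):
        z = -q / 2 - s
    if z == 0:
        # Both roots of (8) vanish, so p = q = 0 and x^3 = 0.
        return (0j, 0j, 0j)
    alpha1 = z ** (1 / 3)          # any one of the three cube roots
    beta1 = -p / (3 * alpha1)      # the value paired with alpha1 by (5)
    return (
        alpha1 + beta1,
        alpha1 * EPS + beta1 * EPS ** 2,
        alpha1 * EPS ** 2 + beta1 * EPS,
    )


for x in cardano(-6, -9):          # x^3 - 6x - 9 = 0
    print(x)
```

The roots come out as complex floating-point numbers, so a real root
may carry a tiny imaginary part caused by rounding.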

Cubic equations with real coefficients. Let us see what can be
said about the roots of the reduced cubic equation

$$x^3 + px + q = 0 \qquad (11)$$

if its coefficients are real. It turns out that in this case the main role
is played by the sign of the expression $\frac{q^2}{4} + \frac{p^3}{27}$, which in
Cardan's formula is under the square-root sign. Notice that the sign of
this expression is the opposite of the sign of the expression

$$D = -4p^3 - 27q^2 = -108\left(\frac{q^2}{4} + \frac{p^3}{27}\right)$$

which is called the discriminant of equation (11) (see Sec. 54, below).
The sign of the discriminant will be used in subsequent statements.

(1) Let $D < 0$. In this case, there is a positive number under
each of the square-root signs in Cardan's formula, and so each of the
cube roots involves real numbers. However, a cube root of a real
number has one real and two conjugate complex values. Let $\alpha_1$ be
the real value of the root $\alpha$; then the value $\beta_1$ of the root $\beta$,
corresponding to $\alpha_1$ on the basis of formula (5), will also be real
because the number $p$ is real. Thus, the root $x_1 = \alpha_1 + \beta_1$ of
equation (11) is real. We find the other two roots by replacing, in
formulas (10) of this section, the roots of unity $\varepsilon = \varepsilon_1$ and
$\varepsilon^2 = \varepsilon_2$ by their expressions (7), Sec. 19:

$$x_2 = \alpha_1\varepsilon + \beta_1\varepsilon^2 = \alpha_1\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right) + \beta_1\left(-\frac{1}{2} - i\frac{\sqrt{3}}{2}\right) = -\frac{\alpha_1 + \beta_1}{2} + i\sqrt{3}\,\frac{\alpha_1 - \beta_1}{2},$$

$$x_3 = \alpha_1\varepsilon^2 + \beta_1\varepsilon = \alpha_1\left(-\frac{1}{2} - i\frac{\sqrt{3}}{2}\right) + \beta_1\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right) = -\frac{\alpha_1 + \beta_1}{2} - i\sqrt{3}\,\frac{\alpha_1 - \beta_1}{2}$$

Since the numbers $\alpha_1$ and $\beta_1$ are real, these two roots turn out to be
conjugate complex numbers, the coefficient of the imaginary part
being different from zero since $\alpha_1 \neq \beta_1$: these numbers are the
real values of cube roots of distinct real numbers.

Thus, if $D < 0$, then equation (11) has one real and two conjugate
complex roots.

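(2) Let $D = 0$. Then

$$\alpha = \sqrt[3]{-\frac{q}{2}}, \qquad \beta = \sqrt[3]{-\frac{q}{2}}$$

Let $\alpha_1$ be the real value of the root $\alpha$; then $\beta_1$ will also, by (5),
be a real number, and $\alpha_1 = \beta_1$. Replacing, in formulas (10), $\beta_1$ by
$\alpha_1$ and using the obvious equality $\varepsilon + \varepsilon^2 = -1$, we get

$$x_1 = 2\alpha_1, \quad x_2 = \alpha_1(\varepsilon + \varepsilon^2) = -\alpha_1, \quad x_3 = \alpha_1(\varepsilon^2 + \varepsilon) = -\alpha_1$$

Thus, if $D = 0$, then all roots of (11) are real and two of them are
equal.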

(3) Finally, let $D > 0$. Then in Cardan's formula there is a
negative real number under the square-root sign. Therefore, under the
signs of the cube roots we have conjugate complex numbers. Thus,
all the values of the roots $\alpha$ and $\beta$ will now be complex numbers.
However, there must be at least one real root among the roots of
equation (11). Let this root be

$$x_1 = \alpha_0 + \beta_0$$

Since both the sum of the numbers $\alpha_0$ and $\beta_0$ and their product, equal
to $-\frac{p}{3}$, are real, it follows that the numbers $\alpha_0$ and $\beta_0$ are
conjugate as roots of a quadratic equation with real coefficients. But
then the numbers $\alpha_0\varepsilon$ and $\beta_0\varepsilon^2$, and likewise the numbers
$\alpha_0\varepsilon^2$ and $\beta_0\varepsilon$, are also conjugate, whence it follows that
the roots of equation (11)

$$x_2 = \alpha_0\varepsilon + \beta_0\varepsilon^2, \qquad x_3 = \alpha_0\varepsilon^2 + \beta_0\varepsilon$$

are real numbers too.

We thus see that the three roots of (11) are real, and it is easy
to show that they are all distinct, for otherwise the choice of the
root $x_1$ might be accomplished so that we would get the equality
$x_2 = x_3$, whence

$$\alpha_0(\varepsilon - \varepsilon^2) = \beta_0(\varepsilon - \varepsilon^2)$$

or $\alpha_0 = \beta_0$, which is clearly impossible.

Thus, if $D > 0$, then equation (11) has three distinct real roots.

The last case that we have just considered shows that Cardan's
formula is of slight practical value. Indeed, although for $D > 0$
all roots of (11) with real coefficients are real numbers, to find them
using Cardan's formula requires extracting the cube roots of
complex numbers, which is only possible if the numbers are represented
in trigonometric form. That is why there is no practical value in
writing the roots as radicals. Using methods that go beyond the
scope of this book, we could demonstrate that in the case at hand
the roots of equation (11) cannot, in general, be expressed in terms
of coefficients by means of radicals with real radicands. This case
of the solution of (11) is called the irreducible case (not to be
confused with the irreducibility of polynomials).
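The three cases distinguished by the sign of the discriminant are easy
to tabulate mechanically. The sketch below (the function name
`classify_real_cubic` and the wording of the returned labels are ours)
evaluates $D = -4p^3 - 27q^2$ exactly with rational arithmetic, so that the
borderline case $D = 0$ is not blurred by rounding.

```python
from fractions import Fraction


def classify_real_cubic(p, q):
    """Classify the roots of x^3 + p*x + q = 0 with real (rational) p, q.

    Returns the discriminant D = -4p^3 - 27q^2 and a description of the
    roots according to the sign of D.
    """
    p, q = Fraction(p), Fraction(q)
    D = -4 * p ** 3 - 27 * q ** 2
    if D < 0:
        return D, "one real root and two conjugate complex roots"
    if D == 0:
        return D, "all roots real, with a repeated root"
    return D, "three distinct real roots"


for p, q in [(-6, -9), (-12, 16), (-19, 30)]:
    print((p, q), *classify_real_cubic(p, q))
```

The three sample pairs are the reduced equations $x^3 - 6x - 9 = 0$,
$x^3 - 12x + 16 = 0$ and $x^3 - 19x + 30 = 0$, one for each sign of $D$.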


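Example 1. Solve the equation

$$y^3 + 3y^2 - 3y - 14 = 0$$

The substitution $y = x - 1$ reduces this equation to

$$x^3 - 6x - 9 = 0 \qquad (12)$$

Here, $p = -6$, $q = -9$, and so

$$\frac{q^2}{4} + \frac{p^3}{27} = \frac{49}{4} > 0$$

That is, equation (12) has one real and two conjugate complex roots. By (9),

$$\alpha = \sqrt[3]{\frac{9}{2} + \frac{7}{2}} = \sqrt[3]{8}, \qquad \beta = \sqrt[3]{\frac{9}{2} - \frac{7}{2}} = \sqrt[3]{1}$$

Hence $\alpha_1 = 2$, $\beta_1 = 1$, that is, $x_1 = 3$, $x_2 = -\frac{3}{2} + i\frac{\sqrt{3}}{2}$,
$x_3 = -\frac{3}{2} - i\frac{\sqrt{3}}{2}$. This implies that the roots of the given
equation are the numbers

$$y_1 = 2, \qquad y_2 = -\frac{5}{2} + i\frac{\sqrt{3}}{2}, \qquad y_3 = -\frac{5}{2} - i\frac{\sqrt{3}}{2}$$

Example 2. Solve

$$x^3 - 12x + 16 = 0$$

Here, $p = -12$, $q = 16$, and so

$$\frac{q^2}{4} + \frac{p^3}{27} = 0$$

whence $\alpha = \sqrt[3]{-8}$, that is, $\alpha_1 = -2$. And therefore $x_1 = -4$,
$x_2 = x_3 = 2$.

Example 3. Solve

$$x^3 - 19x + 30 = 0$$

Here, $p = -19$, $q = 30$, and so

$$\frac{q^2}{4} + \frac{p^3}{27} = -\frac{784}{27} < 0$$

Thus, Cardan's formula is not applicable to this equation if we remain in the
domain of real numbers, although the roots are the real numbers 2, 3, $-5$.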

Quartic equations. The solution of the quartic equation

$$y^4 + ay^3 + by^2 + cy + d = 0 \qquad (13)$$

with arbitrary complex coefficients reduces to a solution of some
auxiliary cubic equation. This is achieved by a procedure due to
Ferrari.

First, the substitution $y = x - \frac{a}{4}$ reduces equation (13) to the
form

$$x^4 + px^2 + qx + r = 0 \qquad (14)$$

The left member of this equation is then identically transformed with
the aid of the auxiliary parameter $\alpha$:

$$x^4 + px^2 + qx + r = \left(x^2 + \frac{p}{2} + \alpha\right)^2 + qx + r - \frac{p^2}{4} - \alpha^2 - 2\alpha x^2 - p\alpha$$

or

$$\left(x^2 + \frac{p}{2} + \alpha\right)^2 - \left[2\alpha x^2 - qx + \left(\alpha^2 + p\alpha - r + \frac{p^2}{4}\right)\right] = 0 \qquad (15)$$

Now choose $\alpha$ so that the polynomial in the square brackets becomes a
perfect square. This requires that it have one double root; in other
words, we must have the equation

$$q^2 - 4 \cdot 2\alpha\left(\alpha^2 + p\alpha - r + \frac{p^2}{4}\right) = 0 \qquad (16)$$

Equation (16) is a cubic equation in the unknown $\alpha$ with complex
coefficients. As we know, this equation has three complex roots.
Let $\alpha_0$ be one of them; it is expressed, by Cardan's formula, with
the aid of radicals in terms of the coefficients of equation (16), that
is, in terms of the coefficients of equation (14).

Given this choice of value for $\alpha$, the polynomial in the square
brackets in (15) has the double root $\frac{q}{4\alpha_0}$, and so equation (15)
takes the form

$$\left(x^2 + \frac{p}{2} + \alpha_0\right)^2 - 2\alpha_0\left(x - \frac{q}{4\alpha_0}\right)^2 = 0$$

Hence it decomposes into two quadratic equations:

$$\begin{aligned} x^2 - \sqrt{2\alpha_0}\,x + \left(\frac{p}{2} + \alpha_0 + \frac{q}{2\sqrt{2\alpha_0}}\right) &= 0, \\ x^2 + \sqrt{2\alpha_0}\,x + \left(\frac{p}{2} + \alpha_0 - \frac{q}{2\sqrt{2\alpha_0}}\right) &= 0 \end{aligned} \qquad (17)$$

Since we passed from (14) to (17) by means of identity
transformations, the roots of (17) will serve as roots for equation (14)
as well. At the same time, it is easy to see that the roots of (14) are
expressed in terms of coefficients by means of radicals. We will
not write out the appropriate formulas because they are exceedingly
unwieldy and of no practical use. Neither will we investigate
separately the case when (14) has real coefficients.
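Ferrari's procedure for the reduced quartic $x^4 + px^2 + qx + r = 0$ can be
sketched in Python as below. This is an illustration under stated
assumptions, not a formula from the text: the helper names are ours, the
resolvent (16) is rewritten as $\alpha^3 + p\alpha^2 + (p^2/4 - r)\alpha - q^2/8 = 0$
and one of its roots is found by Cardan's formula, and the case $q = 0$
(where $\alpha_0$ could vanish and the division by $\sqrt{2\alpha_0}$ would fail)
is handled separately as a biquadratic equation.

```python
import cmath


def one_cubic_root(a, b, c):
    """One root of t^3 + a*t^2 + b*t + c = 0, by Cardan's formula."""
    P = b - a * a / 3                       # reduced cubic u^3 + P*u + Q = 0
    Q = 2 * a ** 3 / 27 - a * b / 3 + c     # after the shift t = u - a/3
    s = cmath.sqrt(Q * Q / 4 + P ** 3 / 27)
    z = max(-Q / 2 + s, -Q / 2 - s, key=abs)
    if z == 0:                              # P = Q = 0, so u = 0
        return -a / 3
    alpha = z ** (1 / 3)
    return alpha - P / (3 * alpha) - a / 3


def quadratic_roots(b, c):
    """Both roots of x^2 + b*x + c = 0."""
    d = cmath.sqrt(b * b / 4 - c)
    return [-b / 2 + d, -b / 2 - d]


def ferrari(p, q, r):
    """Four roots of x^4 + p*x^2 + q*x + r = 0."""
    if q == 0:
        # Biquadratic case: solve for x^2 first.
        roots = []
        for z in quadratic_roots(p, r):
            w = cmath.sqrt(z)
            roots += [w, -w]
        return roots
    # A root alpha0 of the resolvent (16); it is nonzero because q != 0.
    a0 = one_cubic_root(p, p * p / 4 - r, -q * q / 8)
    m = cmath.sqrt(2 * a0)
    return (quadratic_roots(-m, p / 2 + a0 + q / (2 * m))
            + quadratic_roots(m, p / 2 + a0 - q / (2 * m)))


# (x - 1)(x - 2)(x - 3)(x + 6) = x^4 - 25x^2 + 60x - 36
print(ferrari(-25, 60, -36))
```

The sample quartic was chosen with roots summing to zero, so that it is
already in the reduced form (14).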

Remarks on higher-degree equations. Whereas the ancient Greeks
knew the methods for solving quadratic equations, the
above-described methods for solving cubic and quartic equations were
discovered only in the 16th century. This was followed by almost three
centuries of unsuccessful attempts to find formulas expressing by
radicals the roots of any quintic equation (an equation of the fifth
degree with literal coefficients) in terms of its coefficients. These
attempts ceased only after Abel demonstrated, in the 1820's, that
no such formulas can be found for $n$th-degree equations where
$n \geq 5$.

This result of Abel's however did not preclude the possibility
that the roots of a concrete polynomial with numerical coefficients
could, in some way, be expressed in terms of the coefficients by
some combination of radicals, or, as we usually say, that any
equation is solvable by radicals. In the 1830's, Galois made a complete
investigation of the conditions under which a given equation is
solvable by radicals. It turned out that for any $n$ equal to or greater
than 5 there are $n$th-degree equations even with integral coefficients
that are not solvable by radicals. Such, for instance, is the equation

$$x^5 - 4x - 2 = 0$$

The investigations of Galois exerted a decisive influence on the
subsequent development of algebra, but they lie outside the scope
of this text.


39. Bounds of Roots 

We know that there is no method by which we can find the exact
values of the roots of polynomials with numerical coefficients.
Nevertheless, a vast range of problems in mechanics, physics and
engineering at large reduce to the problem of the roots of polynomials,
which at times are of very high degree. This circumstance spurred
numerous investigations to find ways of describing the roots of
a polynomial with numerical coefficients without actually knowing
the roots. For example, studies have been made of the location
of roots in the complex plane (the conditions under which all roots
lie within the unit circle, that is, are less than unity in absolute
value, or the conditions prescribing all roots to lie in the left
half-plane, that is, to have negative real parts, etc.). For polynomials
with real coefficients, methods have been elaborated for determining
the number of their real roots, for finding the bounds within
which these roots may be located, etc. Finally, much research has
been done into methods of approximation of roots: in engineering
situations, it is ordinarily enough to know only certain approximate
values of the roots to within a specified accuracy, and if, say,
the roots of a polynomial were even written as radicals, the latter
would in any case be replaced by their approximations.

At one time, such studies constituted the basic content of higher
algebra. We include here only a very small portion of the pertinent
results, and taking into account the primary demands of applications
we confine ourselves to the case of polynomials with real
coefficients and real roots. In only a few instances will we go farther
afield. We will consider the polynomial $f(x)$ with real coefficients
as a (continuous) real function of a real variable $x$ and wherever
advisable we will take advantage of the results and methods of
mathematical analysis.

A good way to begin the study of the real roots of a polynomial
$f(x)$ with real coefficients is to examine the graph of the polynomial:
obviously, the real roots of the polynomial are the abscissas of the
points of intersection of the graph and the $x$-axis, and only they.

To take an example, let us consider the fifth-degree polynomial

$$h(x) = x^5 + 2x^4 - 5x^3 + 8x^2 - 7x - 3$$

On the basis of the results of Sec. 24, we can assert the following
concerning the roots of this polynomial: since its degree is odd, $h(x)$
has at least one real root; but if the number of real roots is greater
than unity, then it is equal to three or five, since complex roots are
pairwise conjugate.

An examination of the graph of the polynomial $h(x)$ enables us
to say a good deal more about the roots. We construct the graph
(Fig. 0; note that the scale on the $x$-axis is ten times that on the
$y$-axis), taking only integral values of $x$ and computing the
corresponding values of $h(x)$, say by the Horner method:

      x  |  h(x)
    -----+------
     -4  |  -39
     -3  |  144
     -2  |   83
     -1  |   18
      0  |   -3
      1  |   -4
      2  |   39

We see that the polynomial $h(x)$ has in any case three real roots:
the positive root $\alpha_1$ and two negative roots $\alpha_2$ and $\alpha_3$, with

$$1 < \alpha_1 < 2, \qquad -1 < \alpha_2 < 0, \qquad -4 < \alpha_3 < -3$$
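The tabulation by the Horner method is easy to mechanize. The sketch
below (function and variable names are ours) evaluates $h(x)$ at the
integers from $-4$ to 2 and lists the unit intervals at whose ends the
values have opposite signs; by continuity each such interval contains at
least one real root.

```python
def horner(coeffs, x):
    """Evaluate a0*x^n + a1*x^(n-1) + ... + an by the Horner scheme.

    `coeffs` lists the coefficients from the leading one down.
    """
    value = 0
    for a in coeffs:
        value = value * x + a
    return value


# h(x) = x^5 + 2x^4 - 5x^3 + 8x^2 - 7x - 3
H = [1, 2, -5, 8, -7, -3]

table = [(x, horner(H, x)) for x in range(-4, 3)]
sign_changes = [
    (x0, x1)
    for (x0, y0), (x1, y1) in zip(table, table[1:])
    if y0 * y1 < 0
]

for x, y in table:
    print(x, y)
print(sign_changes)
```

As the text goes on to point out, such a table can miss roots lying
between two integers at which the values have the same sign.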



Ordinarily, the information on the (real) roots of a polynomial
that we get by examining the graph is very satisfactory in a practical
sense. However, the doubt always remains as to whether we have
indeed found all the roots. For instance, in the case at hand we did
not show that to the right of $x = 2$ and to the left of $x = -4$ there
are no roots of the polynomial. What is more, since we only took
integral values of $x$, we can assume that the graph we constructed
does not very accurately reflect the true behaviour of the function
$h(x)$; it may not, say, take into account the smaller fluctuations
and so loses some roots.

True, we could have taken values down to 0.1 or 0.01, in
addition to the integral values of $x$. But then the computations would
have been severely complicated and doubts would still remain.
On the other hand, we could apply mathematical analysis to test
the function $h(x)$ for maxima and minima and thus compare our
graph with the true behaviour of the function; but this brings us to
the problem of the roots of the derivative $h'(x)$, which is the same
kind of problem we are dealing with right now.

The need is evident for more sophisticated procedures enabling
us to find the bounds within which lie the real roots of a polynomial
with real coefficients and to determine the number of the roots. We
shall now examine the problem of the bounds of real roots and leave
the question of the number of roots to later sections.

The proof of the lemma on the modulus of the highest-degree
term (see Sec. 23) already provides a certain bound for the absolute
values of the roots of a polynomial. Indeed, setting $k = 1$ in the
inequality of that lemma, we find that for

$$|x| \geq 1 + \frac{A}{|a_0|} \qquad (1)$$

(where $a_0$ is the leading coefficient and $A$ is the maximum of the
absolute values of the remaining coefficients), the absolute value
of the highest-degree term of the polynomial is greater than the
absolute value of the sum of all the other terms, and so no value of
$x$ that satisfies inequality (1) can be a root of this polynomial.

Thus, for a polynomial $f(x)$ with arbitrary numerical coefficients,
the number $1 + \frac{A}{|a_0|}$ serves as an upper bound of the moduli (absolute
values) of all its roots, real and complex. For the case above of the
polynomial $h(x)$, this bound, since $a_0 = 1$, $A = 8$, is the number 9.

However, this bound is usually too high, particularly if we are
only interested in the bounds of the real roots. We now give certain
more precise methods. It is well to bear in mind that if the bounds
are indicated within which the real roots of a polynomial are to be
found, this does not in the least mean that such roots actually exist.
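The bound $1 + A/|a_0|$ on the moduli of all roots is a one-line
computation. The following sketch (the function name `modulus_bound`
is ours) takes the coefficients from the leading one down and works for
complex coefficients as well, since only absolute values are used.

```python
def modulus_bound(coeffs):
    """Upper bound 1 + A/|a0| for the moduli of all roots.

    `coeffs` = [a0, a1, ..., an] with a0 != 0 and n >= 1; A is the
    largest absolute value among a1, ..., an.
    """
    a0, rest = coeffs[0], coeffs[1:]
    A = max(abs(a) for a in rest)
    return 1 + A / abs(a0)


# h(x) = x^5 + 2x^4 - 5x^3 + 8x^2 - 7x - 3
print(modulus_bound([1, 2, -5, 8, -7, -3]))
```

For $h(x)$ this gives $a_0 = 1$, $A = 8$ and hence the bound 9 quoted in the
text.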

Let us first demonstrate that it is sufficient to be able to find only
the upper bound of the positive roots of any polynomial. Let there be
given a polynomial $f(x)$ of degree $n$ and let $N_0$ be the upper bound of
its positive roots. We consider the polynomials

$$\varphi_1(x) = x^n f\left(\frac{1}{x}\right),$$

$$\varphi_2(x) = f(-x),$$

$$\varphi_3(x) = x^n f\left(-\frac{1}{x}\right)$$

and find the upper bounds of their positive roots. Suppose these are
the numbers, respectively, $N_1$, $N_2$, $N_3$. Then the number $\frac{1}{N_1}$ will be
the lower bound of the positive roots of the polynomial $f(x)$: if $\alpha$ is
a positive root of $f(x)$, then $\frac{1}{\alpha}$ will be a positive root of
$\varphi_1(x)$, and from $\frac{1}{\alpha} < N_1$ follows $\alpha > \frac{1}{N_1}$. Similarly,
the numbers $-N_2$ and $-\frac{1}{N_3}$ serve, respectively, as the lower and
upper bounds of the negative roots of the polynomial $f(x)$. Thus, all
positive roots of $f(x)$ satisfy the inequalities $\frac{1}{N_1} < x < N_0$, and all
negative roots, the inequalities $-N_2 < x < -\frac{1}{N_3}$.


To determine the upper bound of the positive roots we can use
the following method. Suppose we have the polynomial

$$f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_n$$

with real coefficients, and $a_0 > 0$. Let $a_k$, $k \geq 1$, be the first of the
negative coefficients; if there were no such coefficients, then the
polynomial $f(x)$ could not have any positive roots at all. Finally,
let $B$ be the greatest of the absolute values of the negative
coefficients. Then the number

$$1 + \sqrt[k]{\frac{B}{a_0}}$$

serves as the upper bound for the positive roots of the polynomial $f(x)$.

Indeed, setting $x > 1$ and replacing each of the coefficients
$a_1, a_2, \ldots, a_{k-1}$ by the number zero, and each of the coefficients
$a_k, a_{k+1}, \ldots, a_n$ by the number $-B$, we can only diminish the
value of the polynomial, that is,

$$f(x) \geq a_0 x^n - B(x^{n-k} + x^{n-k-1} + \ldots + x + 1) = a_0 x^n - B\,\frac{x^{n-k+1} - 1}{x - 1}$$

or, because $x > 1$,

$$f(x) > a_0 x^n - \frac{B x^{n-k+1}}{x - 1} = \frac{x^{n-k+1}}{x - 1}\left[a_0 x^{k-1}(x - 1) - B\right] \qquad (2)$$

If

$$x > 1 + \sqrt[k]{\frac{B}{a_0}} \qquad (3)$$

then, since

$$a_0 x^{k-1}(x - 1) - B \geq a_0 (x - 1)^k - B > 0,$$

the expression in square brackets in formula (2) will prove to be
positive; thus, by (2), the value of $f(x)$ will be strictly positive.
Thus, the values of $x$ which satisfy the inequality (3) cannot be roots
of $f(x)$, which is what we set out to prove.

Taking the above-considered polynomial $h(x)$, this method
(since $k = 2$, $B = 7$) yields for the upper bound of the positive
roots the number $1 + \sqrt{7}$, which can be replaced by the nearest
greater integer 4.
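This bound, too, is readily computed from the coefficients. In the sketch
below (the function name `positive_root_bound` is ours; returning `None`
when there are no negative coefficients is our convention for "no positive
roots at all") the index $k$ is the position of the first negative
coefficient and $B$ the largest absolute value among the negative ones.

```python
from math import ceil


def positive_root_bound(coeffs):
    """Upper bound 1 + (B/a0)^(1/k) for the positive roots.

    `coeffs` = [a0, a1, ..., an] with real entries and a0 > 0.
    Returns None if no coefficient is negative, in which case the
    polynomial has no positive roots.
    """
    a0 = coeffs[0]
    negatives = [(k, a) for k, a in enumerate(coeffs) if a < 0]
    if not negatives:
        return None
    k = negatives[0][0]                 # index of the first negative one
    B = max(-a for _, a in negatives)   # greatest absolute value
    return 1 + (B / a0) ** (1 / k)


bound = positive_root_bound([1, 2, -5, 8, -7, -3])
print(bound, ceil(bound))
```

For $h(x)$ the first negative coefficient is $a_2 = -5$ and $B = 7$, so the
function returns $1 + \sqrt{7}$, which rounds up to 4.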

Of the many other methods of finding the upper bound of positive
roots, we give Newton's method. It is more involved than the one we
just gave above, but ordinarily it yields a very good result.

Suppose we have a polynomial $f(x)$ with real coefficients and
positive leading coefficient $a_0$. If, for $x = c$, the polynomial $f(x)$
and all its successive derivatives $f'(x), f''(x), \ldots, f^{(n)}(x)$ take on
positive values, then the number $c$ serves as the upper bound of the
positive roots.

True enough, by Taylor's formula (see Sec. 23),

$$f(x) = f(c) + (x - c) f'(c) + (x - c)^2 \frac{f''(c)}{2!} + \ldots + (x - c)^n \frac{f^{(n)}(c)}{n!}$$

We see that if $x \geq c$, then on the right we get a strictly positive
number, that is, such values of $x$ cannot be the roots of $f(x)$.

When seeking the appropriate number $c$ for a given polynomial
$f(x)$, it is useful to do as follows. The derivative $f^{(n)}(x) = n!\,a_0$ is
a positive number, and so the polynomial $f^{(n-1)}(x)$ is an increasing
function of $x$. Consequently, there is a number $c_1$ such that for
$x \geq c_1$ the derivative $f^{(n-1)}(x)$ is positive. Whence it follows that
for $x \geq c_1$ the derivative $f^{(n-2)}(x)$ will be an increasing function of
$x$ and therefore there exists a number $c_2$, $c_2 \geq c_1$, such that for
$x \geq c_2$ the derivative $f^{(n-2)}(x)$ is also positive. Continuing thus, we
finally arrive at the desired number $c$.



Applying Newton's method to the polynomial $h(x)$ considered
above, we have

$$\begin{aligned} h(x) &= x^5 + 2x^4 - 5x^3 + 8x^2 - 7x - 3, \\ h'(x) &= 5x^4 + 8x^3 - 15x^2 + 16x - 7, \\ h''(x) &= 20x^3 + 24x^2 - 30x + 16, \\ h'''(x) &= 60x^2 + 48x - 30, \\ h^{\mathrm{IV}}(x) &= 120x + 48, \\ h^{\mathrm{V}}(x) &= 120 \end{aligned}$$

It is easy to verify (say, by the Horner method) that all these
polynomials are positive for $x = 2$. Thus the number 2 is the upper
bound for the positive roots of the polynomial $h(x)$. This result
is much more exact than those obtained by other methods.

To find a lower bound for the negative roots of the polynomial $h(x)$,
let us consider the polynomial $\varphi_2(x) = -h(-x)$*. Since

$$\begin{aligned} \varphi_2(x) &= x^5 - 2x^4 - 5x^3 - 8x^2 - 7x + 3, \\ \varphi_2'(x) &= 5x^4 - 8x^3 - 15x^2 - 16x - 7, \\ \varphi_2''(x) &= 20x^3 - 24x^2 - 30x - 16, \\ \varphi_2'''(x) &= 60x^2 - 48x - 30, \\ \varphi_2^{\mathrm{IV}}(x) &= 120x - 48, \\ \varphi_2^{\mathrm{V}}(x) &= 120 \end{aligned}$$

and all these polynomials are positive for $x = 4$ (as may readily be
checked), the number 4 serves as an upper bound for the positive
roots of $\varphi_2(x)$, and so the number $-4$ will be a lower bound for the
negative roots of $h(x)$.

Finally, let us consider the polynomials

$$\varphi_1(x) = -x^5 h\left(\frac{1}{x}\right) = 3x^5 + 7x^4 - 8x^3 + 5x^2 - 2x - 1,$$

$$\varphi_3(x) = -x^5 h\left(-\frac{1}{x}\right) = 3x^5 - 7x^4 - 8x^3 - 5x^2 - 2x + 1$$

For them, again using the Newton method, we find the numbers 1
and 4 as upper bounds for the positive roots and so the number
$\frac{1}{1} = 1$ is the lower bound for the positive roots of $h(x)$ and the
number $-\frac{1}{4}$ is the upper bound for the negative roots.

Thus, the positive roots of $h(x)$ lie between 1 and 2 and the
negative roots lie between the numbers $-4$ and $-\frac{1}{4}$. This result is in
very good agreement with what we found earlier when we examined
the graph.

* $-h(-x)$ in place of $h(-x)$ because Newton's method is applicable
only if the leading coefficient is positive. This change of sign of course has no
effect whatsoever on the roots of the polynomial $\varphi_2(x)$.


40. Sturm's Theorem

We now come to the question of the number of real roots of a polynomial
$f(x)$ with real coefficients. We will be interested both in the
total number of real roots and, separately, in the number of positive
and the number of negative roots, and in the total number of roots in
the interval between specified bounds $a$ and $b$. There are several
methods for finding the exact number of roots, and all of them are
rather unwieldy; the most convenient one is the Sturm method,
which is described below.

First let us introduce a definition that will be needed in the
next section as well.

Suppose we have a finite ordered sequence of real numbers
different from zero, say

1, o. -2. 1. -d, _■!, -1. 1 (1) 

write out the signs of these numbers in succession:

+, +, -, +, -, -, -, + (2)

We see that there are four variations of sign in (2). We then say that
in the ordered sequence (1) there are four variations in sign. The number
of variations in sign can, of course, be counted for any finite ordered
sequence of nonzero real numbers.

Now let us consider the polynomial $f(x)$ with real coefficients.
We will assume that $f(x)$ does not have multiple roots, for otherwise we
could divide it by the greatest common divisor of it and its derivative.
The finite ordered sequence of nonzero polynomials with real coefficients

$$f(x) = f_0(x),\ f_1(x),\ f_2(x),\ \dots,\ f_s(x) \tag{3}$$

is called the Sturm sequence for the polynomial $f(x)$ if the following
requirements are met:

(1) Successive polynomials of (3) do not have common roots.
(2) The last polynomial, $f_s(x)$, does not have real roots.
(3) If $\alpha$ is a real root of one of the intermediate polynomials
$f_k(x)$ of (3), $1 \le k \le s - 1$, then $f_{k-1}(\alpha)$ and $f_{k+1}(\alpha)$ have
different signs.
(4) If $\alpha$ is a real root of $f(x)$, then the product $f(x) f_1(x)$ changes
sign from minus to plus when $x$ increases and passes through the
point $\alpha$.

The question of whether every polynomial has a Sturm sequence
will be considered later on; for the present let us suppose that $f(x)$
does have such a sequence and let us show how it can be used to
find the number of real roots.

If a real number $c$ is not a root of the given polynomial $f(x)$
and (3) is a Sturm sequence for this polynomial, then take the set of
real numbers

$$f(c),\ f_1(c),\ f_2(c),\ \dots,\ f_s(c)$$

delete all numbers equal to zero and denote by $W(c)$ the number
of variations in sign in the remaining sequence; we call $W(c)$ the
number of variations in sign in the Sturm sequence (3) of the polynomial
$f(x)$ for $x = c$.*
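
Counting variations in sign with zeros deleted is a purely mechanical step. A minimal sketch follows; the function name `sign_variations` and the sample list are hypothetical and serve only as an illustration.

```python
def sign_variations(values):
    """Number of variations in sign in a finite ordered sequence of
    real numbers; entries equal to zero are deleted first."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


# Illustrative data (not from the text): signs are +, -, +, +, -
print(sign_variations([1, -1, 0, 2, 2, -3]))
```

Given the values of the polynomials of a Sturm sequence at a point $c$, this function returns the number of variations in sign at $c$.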

The following theorem holds.

Sturm's theorem. If the real numbers $a$ and $b$, $a < b$, are not the
roots of a polynomial $f(x)$ which does not have any multiple
roots, then $W(a) \ge W(b)$ and the difference $W(a) - W(b)$
is equal to the number of real roots of $f(x)$ in the interval
between $a$ and $b$.

Thus, to determine the number of real roots of a polynomial
$f(x)$ lying between $a$ and $b$ (recall that $f(x)$ does not, by hypothesis,
have multiple roots), it suffices to establish the reduction in the
number of variations of sign in the Sturm sequence of this polynomial
when moving from $a$ to $b$.

To prove this theorem, let us see how the number $W(x)$ varies
with increasing $x$. So long as $x$, as it increases, does not encounter any
of the roots of the polynomials of the Sturm sequence (3), the signs of
the polynomials of the sequence do not change and so the number $W(x)$
remains unaltered. For this reason, and also because of Condition (2) of the
definition of a Sturm sequence, it remains for us to consider two
cases: the passage of $x$ through a root of one of the intermediate
polynomials $f_k(x)$, $1 \le k \le s - 1$, and the passage of $x$ through a root
of the polynomial $f(x)$ itself.

Let $\alpha$ be a root of the polynomial $f_k(x)$, $1 \le k \le s - 1$. Then,
by Condition (1), $f_{k-1}(\alpha)$ and $f_{k+1}(\alpha)$ are different from zero. We
can thus find a positive number $\varepsilon$, which may be very small, such
that in the interval $(\alpha - \varepsilon, \alpha + \varepsilon)$ the polynomials $f_{k-1}(x)$ and
$f_{k+1}(x)$ have no roots and therefore preserve constant signs;
Condition (3) states that these signs are distinct. From this it
follows that each of the sequences of numbers

$$f_{k-1}(\alpha - \varepsilon),\ f_k(\alpha - \varepsilon),\ f_{k+1}(\alpha - \varepsilon) \tag{4}$$

and

$$f_{k-1}(\alpha + \varepsilon),\ f_k(\alpha + \varepsilon),\ f_{k+1}(\alpha + \varepsilon) \tag{5}$$

has exactly one variation in sign, irrespective of the signs of the
numbers $f_k(\alpha - \varepsilon)$ and $f_k(\alpha + \varepsilon)$. Thus, for instance, if the
polynomial $f_{k-1}(x)$ is negative on the interval in question and $f_{k+1}(x)$ is
positive and if $f_k(\alpha - \varepsilon) > 0$, $f_k(\alpha + \varepsilon) < 0$, then the sequences
(4) and (5) are associated with the sign sequences

-, +, +;  -, -, +

* Quite naturally, the variations in sign in the Sturm sequence of the
polynomial $f(x)$ have nothing in common with the variation in sign of the
polynomial $f(x)$ itself, which variation occurs when $x$ passes through a root
of the polynomial.

Thus, when $x$ passes through a root of one of the intermediate
polynomials in Sturm's sequence, the variations in sign in the sequence
can only shift position, but do not disappear or reappear, and so the
number $W(x)$ does not change in such a transition.

On the other hand, let $\alpha$ be a root of the given polynomial $f(x)$.
By Condition (1), $\alpha$ will not be a root of $f_1(x)$. Hence, there is a
positive number $\varepsilon$ such that the interval $(\alpha - \varepsilon, \alpha + \varepsilon)$ does not contain
any roots of the polynomial $f_1(x)$, and therefore $f_1(x)$ preserves its
sign over this interval. If the sign is positive, then, by Condition
(4), the polynomial $f(x)$ itself changes sign from minus to plus when
$x$ passes through $\alpha$, i.e., $f(\alpha - \varepsilon) < 0$, $f(\alpha + \varepsilon) > 0$. Hence, to
the number sequences

$$f(\alpha - \varepsilon),\ f_1(\alpha - \varepsilon) \quad\text{and}\quad f(\alpha + \varepsilon),\ f_1(\alpha + \varepsilon) \tag{6}$$

there correspond the sign sequences

-, +  and  +, +

Thus, the Sturm sequence loses one variation in sign. But if the
sign of $f_1(x)$ is negative on the interval $(\alpha - \varepsilon, \alpha + \varepsilon)$, then again,
by Condition (4), the polynomial $f(x)$ changes sign from plus to
minus as $x$ passes through $\alpha$, i.e., $f(\alpha - \varepsilon) > 0$, $f(\alpha + \varepsilon) < 0$.
To the number sequences (6) there now correspond the sign sequences

+, -  and  -, -

The Sturm sequence again loses one variation in sign.

Thus, as $x$ increases, the number $W(x)$ changes only when $x$ passes
through a root of the polynomial $f(x)$; in this case it is diminished
exactly by unity.

This obviously proves the Sturm theorem. To use it for finding
the total number of real roots of a polynomial $f(x)$, it is sufficient
to take, for $a$, the lower limit of the negative roots, and for $b$, the
upper limit of the positive roots. It is simpler, however, to do as
follows. By the lemma proved in Sec. 23 there exists a positive number
$N$, which may be very large, such that for $|x| > N$ the signs of all
polynomials of the Sturm sequence will coincide with the signs of
their highest-degree terms. In other words, there exists a positive
value of the unknown $x$ which is so large that the signs of the
corresponding values of all the polynomials of the Sturm sequence coincide
with the signs of their leading coefficients. This value of $x$, which
need not be computed, can be denoted by $\infty$. On the other hand, there
exists a negative value of $x$ which is so large in absolute value that
the signs of the corresponding values of the polynomials of the
Sturm sequence coincide with the signs of their leading coefficients
for polynomials of even degree and are opposite to the signs of the
leading coefficients for polynomials of odd degree. Let us agree
to denote this value of $x$ by $-\infty$. In the interval $(-\infty, \infty)$ we
obviously have all the real roots of all the polynomials of Sturm's sequence
and, in particular, all the real roots of the polynomial $f(x)$. Applying
the Sturm theorem to this interval, we find the number of these
roots; application of the Sturm theorem to the intervals $(-\infty, 0)$
and $(0, \infty)$ yields, respectively, the number of negative and the
number of positive roots of the polynomial $f(x)$.

It remains to demonstrate that any polynomial $f(x)$ with real
coefficients and without multiple roots has a Sturm sequence. Of a
variety of methods used for constructing such a sequence, we give the
most widely used one. Set $f_1(x) = f'(x)$, thus ensuring fulfillment
of Condition (4) of the definition of a Sturm sequence. Indeed, if
$\alpha$ is a real root of the polynomial $f(x)$, then $f'(\alpha) \ne 0$. If $f'(\alpha) > 0$,
then $f'(x) > 0$ in the neighbourhood of the point $\alpha$ and therefore
$f(x)$ changes sign from minus to plus when $x$ passes through $\alpha$; this is
then also true for the product $f(x) f_1(x)$. Similar reasoning is
likewise valid for $f'(\alpha) < 0$. Then divide $f(x)$ by $f_1(x)$ and take the
remainder (with reversed sign) for $f_2(x)$:

$$f(x) = f_1(x)\, q_1(x) - f_2(x)$$

Generally, if the polynomials $f_{k-1}(x)$ and $f_k(x)$ have already been
found, then $f_{k+1}(x)$ will be the remainder after dividing $f_{k-1}(x)$
by $f_k(x)$ taken with reversed sign:

$$f_{k-1}(x) = f_k(x)\, q_k(x) - f_{k+1}(x) \tag{7}$$

This method differs from the Euclidean algorithm as applied
to the polynomials $f(x)$ and $f'(x)$ solely in the fact that the sign
of the remainder is reversed every time, and the next division
is performed by the remainder with reversed sign. Since such a
variation in sign is inessential when seeking the greatest common divisor,
our process will terminate at some $f_s(x)$, which is the greatest
common divisor of the polynomials $f(x)$ and $f'(x)$; since $f(x)$ has
no multiple roots (it is prime to $f'(x)$), it will follow that $f_s(x)$ is
actually some nonzero real number.

This implies that the sequence of polynomials we have constructed,

$$f(x) = f_0(x),\ f'(x) = f_1(x),\ f_2(x),\ \dots,\ f_s(x)$$

also satisfies Condition (2) of the definition of a Sturm sequence.
To prove that Condition (1) is met, assume that the consecutive
polynomials $f_k(x)$ and $f_{k+1}(x)$ have a common root $\alpha$. Then by (7),
$\alpha$ will also be a root of the polynomial $f_{k-1}(x)$. Passing to the
equation

$$f_{k-2}(x) = f_{k-1}(x)\, q_{k-1}(x) - f_k(x)$$

we find that $\alpha$ is a root of $f_{k-2}(x)$ as well. Continuing, we find that
$\alpha$ is a common root of $f(x)$ and $f'(x)$, which is in conflict with our
assumptions. Finally, fulfillment of Condition (3) follows directly
from equation (7): if $f_k(\alpha) = 0$, then $f_{k-1}(\alpha) = -f_{k+1}(\alpha)$.
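
The construction by division with reversed remainders can be sketched as follows. This is an illustrative sketch, not the text's own procedure in every detail: the names `poly_rem` and `sturm_sequence` are hypothetical, and exact rational arithmetic from the standard `fractions` module is an assumption made to avoid rounding. The remainders are not rescaled, so the resulting polynomials may differ from hand-computed ones by positive constant factors, which does not affect signs.

```python
from fractions import Fraction


def poly_rem(num, den):
    """Remainder of num divided by den (coefficients highest degree first)."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    while len(num) >= len(den):
        factor = num[0] / den[0]
        for i in range(len(den)):
            num[i] -= factor * den[i]
        num.pop(0)                     # leading coefficient is now zero
    while num and num[0] == 0:         # strip any further leading zeros
        num.pop(0)
    return num


def sturm_sequence(f):
    """f, f', then each remainder taken with reversed sign.
    If f has no multiple roots the last entry is a nonzero constant."""
    n = len(f) - 1
    seq = [[Fraction(c) for c in f]]
    seq.append([c * (n - i) for i, c in enumerate(seq[0][:-1])])
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if not r:                      # zero remainder: the process stops
            break
        seq.append([-c for c in r])
    return seq


for p in sturm_sequence([1, 3, 0, -1]):    # x^3 + 3x^2 - 1
    print([str(c) for c in p])
```

If the last polynomial produced is not a constant, the input has multiple roots and the last entry is a greatest common divisor of the polynomial and its derivative, up to a constant factor.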
Let us apply the Sturm method to the polynomial

$$h(x) = x^5 + 2x^4 - 5x^3 + 8x^2 - 7x - 3$$

which we considered in the preceding section. We will not make
a preliminary check to see that $h(x)$ does not have any multiple
roots because the method of constructing a Sturm sequence as
given above is a simultaneous check on the relative primality of the
polynomial and its derivative.

Let us find a Sturm sequence for $h(x)$ by using this method. In
the division process, we will (in contrast to the Euclidean algorithm)
multiply and divide only by arbitrary positive numbers since the
signs of the remainders play a fundamental role in the Sturm method.
We obtain the following sequence:

$$
\begin{aligned}
h(x) &= x^5 + 2x^4 - 5x^3 + 8x^2 - 7x - 3,\\
h_1(x) &= 5x^4 + 8x^3 - 15x^2 + 16x - 7,\\
h_2(x) &= 66x^3 - 150x^2 + 172x + 61,\\
h_3(x) &= -464x^2 + 1135x + 723,\\
h_4(x) &= -32{,}599{,}457x - 8{,}486{,}093,\\
h_5(x) &= -1
\end{aligned}
$$

We determine the signs of the polynomials of this sequence for
$x = -\infty$ and $x = \infty$; to do this, we (as indicated above) only
examine the signs of the leading coefficients and the degrees of the
polynomials. We get the following table:

            h(x)   h1(x)   h2(x)   h3(x)   h4(x)   h5(x)   Number of variations in sign
x = -∞       -       +       -       -       +       -       4
x = ∞        +       +       +       -       -       -       1

Thus, when $x$ passes from $-\infty$ to $\infty$, the Sturm sequence loses
three variations in sign and so the polynomial $h(x)$ has exactly
three real roots. It will be recalled that when we constructed the
graph of this polynomial (in the preceding section) we did not lose
a single root.
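
The rule for signs at $\pm\infty$ (leading coefficient, with a reversal for odd degree at $-\infty$) can be sketched in a few lines. The names `sign_at_infinity` and `variations` are hypothetical; the coefficient lists are the Sturm sequence of $h(x)$ as read from the text above.

```python
def sign_at_infinity(coeffs, positive=True):
    """Sign (+1 or -1) of a polynomial for x of very large absolute value.
    coeffs are listed highest degree first with a nonzero leading entry."""
    sign = 1 if coeffs[0] > 0 else -1
    degree = len(coeffs) - 1
    if not positive and degree % 2 == 1:
        sign = -sign                   # odd degree flips the sign at -infinity
    return sign


def variations(signs):
    """Number of changes between consecutive entries of a list of +1/-1."""
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


sturm_h = [
    [1, 2, -5, 8, -7, -3],
    [5, 8, -15, 16, -7],
    [66, -150, 172, 61],
    [-464, 1135, 723],
    [-32599457, -8486093],
    [-1],
]

w_minus = variations([sign_at_infinity(p, positive=False) for p in sturm_h])
w_plus = variations([sign_at_infinity(p, positive=True) for p in sturm_h])
print(w_minus, w_plus, w_minus - w_plus)
```

The difference of the two counts is the number of real roots of $h(x)$.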

Let us apply the Sturm method to a simpler polynomial:

$$f(x) = x^3 + 3x^2 - 1$$

Let us find the number of its real roots and also the integral bounds
within which each of the roots is located. We shall not construct
the graph of this polynomial.

The Sturm sequence associated with the polynomial $f(x)$ is

$$
\begin{aligned}
f(x) &= x^3 + 3x^2 - 1,\\
f_1(x) &= 3x^2 + 6x,\\
f_2(x) &= 2x + 1,\\
f_3(x) &= 1
\end{aligned}
$$

Let us find the number of variations of sign in this sequence
for $x = -\infty$ and $x = \infty$:

            f(x)   f1(x)   f2(x)   f3(x)   Number of variations in sign
x = -∞       -       +       -       +       3
x = ∞        +       +       +       +       0

Consequently, the polynomial $f(x)$ has three real roots. For a more
precise location of the roots, continue the above table:

            f(x)   f1(x)   f2(x)   f3(x)   Number of variations in sign
x = -3       -       +       -       +       3
x = -2       +       0       -       +       2
x = -1       +       -       -       +       2
x = 0        -       0       +       +       1
x = 1        +       +       +       +       0

Thus, the Sturm sequence of the polynomial $f(x)$ loses one variation
of sign each time $x$ moves from $-3$ to $-2$, from $-1$ to 0 and
from 0 to 1. The roots $\alpha_1$, $\alpha_2$ and $\alpha_3$ of this polynomial thus satisfy
the inequalities

$$-3 < \alpha_1 < -2, \quad -1 < \alpha_2 < 0, \quad 0 < \alpha_3 < 1$$
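
The table for $f(x) = x^3 + 3x^2 - 1$ can be reproduced by evaluating the Sturm sequence at integer points. The following sketch uses hypothetical helper names (`horner`, `W`), and the scan over the integers from $-3$ to 1 is chosen only to match the table.

```python
def horner(coeffs, x):
    """Evaluate a polynomial (highest degree first) at x."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def W(seq, x):
    """Variations in sign of the Sturm sequence at x, zeros deleted."""
    signs = [v > 0 for v in (horner(p, x) for p in seq) if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


# Sturm sequence of f(x) = x^3 + 3x^2 - 1 as given in the text
seq = [[1, 3, 0, -1], [3, 6, 0], [2, 1], [1]]

counts = {x: W(seq, x) for x in range(-3, 2)}
# unit intervals on which exactly one variation is lost contain one root
intervals = [(x, x + 1) for x in range(-3, 1)
             if counts[x] - counts[x + 1] == 1]
print(counts)
print(intervals)
```

None of the integer points used is a root of $f(x)$ itself, so Sturm's theorem applies on each unit interval.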

41. Other Theorems on the Number of Real Roots 

The Sturm theorem completely resolves the question of the num- 
ber of real roots of a polynomial, but it has one essential defect and 
that is the cumbersome computations involved in constructing a 
Sturm sequence, as the reader could see after performing all the 
computations of the first example above. We now prove two theorems 
which do not yield the exact number of real roots but only bound 
the number from above. These theorems are employed after a graph 
has been used to bound the number of real roots from below and at 
times enable us to find the exact number of real roots without
resorting to the Sturm method.

Suppose we have an $n$th-degree polynomial $f(x)$ with real
coefficients; we assume it can have multiple roots. Let us consider a
sequence of its consecutive derivatives:

$$f(x) = f^{(0)}(x),\ f'(x),\ f''(x),\ \dots,\ f^{(n-1)}(x),\ f^{(n)}(x) \tag{1}$$

of which the last one is equal to the leading coefficient $a_0$ of $f(x)$
multiplied by $n!$ and for this reason preserves sign at all times. If
a real number $c$ is not a root of any one of the polynomials of the
sequence (1), then by $S(c)$ we denote the number of variations in
sign in the ordered sequence of numbers

$$f(c),\ f'(c),\ f''(c),\ \dots,\ f^{(n-1)}(c),\ f^{(n)}(c)$$

Thus, we can consider the integer-valued function $S(x)$ defined
for those values of $x$ which do not make any one of the polynomials
in (1) vanish.

Let us see how $S(x)$ varies with increasing $x$. The number $S(x)$
remains unchanged until $x$ passes through a root of one of the
polynomials of (1). We thus have two cases to consider: the passage of $x$
through a root of the polynomial $f(x)$ and the passage of $x$ through
a root of one of the derivatives $f^{(k)}(x)$, $1 \le k \le n - 1$.

Let $\alpha$ be an $l$-fold root of the polynomial $f(x)$, $l \ge 1$, i.e.,

$$f(\alpha) = f'(\alpha) = \dots = f^{(l-1)}(\alpha) = 0, \quad f^{(l)}(\alpha) \ne 0$$

Let a positive number $\varepsilon$ be so small that the interval $(\alpha - \varepsilon, \alpha + \varepsilon)$
does not contain any roots of the polynomials $f(x), f'(x), \dots,
f^{(l-1)}(x)$ different from $\alpha$ and does not contain any root of the
polynomial $f^{(l)}(x)$ either. We will prove that in the number sequence

$$f(\alpha - \varepsilon),\ f'(\alpha - \varepsilon),\ \dots,\ f^{(l-1)}(\alpha - \varepsilon),\ f^{(l)}(\alpha - \varepsilon)$$

any two consecutive numbers have opposite signs, whereas all the
numbers

$$f(\alpha + \varepsilon),\ f'(\alpha + \varepsilon),\ \dots,\ f^{(l-1)}(\alpha + \varepsilon),\ f^{(l)}(\alpha + \varepsilon)$$

have the same sign. Since each one of the polynomials of (1) is a
derivative of the preceding polynomial, all we have to prove is that
if $x$ passes through the root $\alpha$ of polynomial $f(x)$, then, irrespective
of the multiplicity of this root, $f(x)$ and $f'(x)$ had different signs
prior to the passage and have coincident signs after the passage.
If $f(\alpha - \varepsilon) > 0$, then $f(x)$ diminishes on the interval $(\alpha - \varepsilon, \alpha)$,
and so $f'(\alpha - \varepsilon) < 0$; but if $f(\alpha - \varepsilon) < 0$, then $f(x)$ increases and
so $f'(\alpha - \varepsilon) > 0$. Hence in both cases the signs differ. On the
other hand, if $f(\alpha + \varepsilon) > 0$, then $f(x)$ increases on the interval
$(\alpha, \alpha + \varepsilon)$ and so $f'(\alpha + \varepsilon) > 0$; similarly, from $f(\alpha + \varepsilon) < 0$
it follows that $f'(\alpha + \varepsilon) < 0$. Thus, after the passage through the
root $\alpha$, the signs of $f(x)$ and $f'(x)$ must coincide.

From what has been proved it follows that when $x$ passes through
an $l$-fold root of the polynomial $f(x)$ the sequence

$$f(x),\ f'(x),\ \dots,\ f^{(l-1)}(x),\ f^{(l)}(x)$$

loses $l$ variations in sign.

Now let $\alpha$ be a root of the derivatives

$$f^{(k)}(x),\ f^{(k+1)}(x),\ \dots,\ f^{(k+l-1)}(x), \quad 1 \le k \le n - 1, \quad l \ge 1$$

but not a root of $f^{(k-1)}(x)$ or of $f^{(k+l)}(x)$. By what has been proved
above, the passage of $x$ through $\alpha$ implies a loss in the sequence

$$f^{(k)}(x),\ f^{(k+1)}(x),\ \dots,\ f^{(k+l-1)}(x),\ f^{(k+l)}(x)$$

of $l$ variations in sign. True, this passage possibly creates a new
variation in sign between $f^{(k-1)}(x)$ and $f^{(k)}(x)$; however, because
$l \ge 1$, the number of variations in sign, when $x$ passes through $\alpha$, in
the sequence

$$f^{(k-1)}(x),\ f^{(k)}(x),\ f^{(k+1)}(x),\ \dots,\ f^{(k+l-1)}(x),\ f^{(k+l)}(x)$$

either does not change or decreases. It can then decrease only by an
even number since the polynomials $f^{(k-1)}(x)$ and $f^{(k+l)}(x)$ do not
change sign when $x$ passes through the value $\alpha$.

These results imply that if the numbers $a$ and $b$, $a < b$, are not
roots of any one of the polynomials of the sequence (1), then the number of
real roots of the polynomial $f(x)$ lying between $a$ and $b$ (each counted
according to its multiplicity) is equal to the difference $S(a) - S(b)$
or is less than this difference by an even number.

In order to relax the restrictions imposed on the numbers $a$ and
$b$, let us introduce the following notations. Suppose the real number
$c$ is not a root of the polynomial $f(x)$, though it may be a root of some
of the other polynomials of the sequence (1). Denote by $S_+(c)$ the
number of variations in sign in the number sequence

$$f(c),\ f'(c),\ f''(c),\ \dots,\ f^{(n-1)}(c),\ f^{(n)}(c) \tag{2}$$

which is computed as follows: if

$$f^{(k)}(c) = f^{(k+1)}(c) = \dots = f^{(k+l-1)}(c) = 0 \tag{3}$$

but

$$f^{(k-1)}(c) \ne 0, \quad f^{(k+l)}(c) \ne 0 \tag{4}$$

then we take it that $f^{(k)}(c), f^{(k+1)}(c), \dots, f^{(k+l-1)}(c)$ have the
same sign as $f^{(k+l)}(c)$; this is obviously the same as deleting the zeros
in a count of the number of variations of sign in the sequence (2).
On the other hand, by $S_-(c)$ we denote the number of variations of
sign in the sequence (2), which is counted as follows: if conditions
(3) and (4) hold, then we take it that $f^{(k+i)}(c)$, $0 \le i \le l - 1$, has
the same sign as $f^{(k+l)}(c)$ if the difference $l - i$ is even, and opposite
sign if this difference is odd.

If we now desire to determine the number of real roots of the
polynomial $f(x)$ between $a$ and $b$, $a < b$, and $a$ and $b$ are not roots
of $f(x)$ but, possibly, are roots of the other polynomials of the
sequence (1), then we do as follows. Let $\varepsilon$ be so small that the interval
$(a, a + 2\varepsilon)$ does not contain any roots of $f(x)$, or any roots (distinct
from $a$) of the other polynomials of the sequence (1); on the other
hand, let $\eta$ be so small that the interval $(b - 2\eta, b)$ also fails to
contain any roots of $f(x)$ and any roots (distinct from $b$) of the other
polynomials of the sequence (1). Then the number we want of real
roots of the polynomial $f(x)$ will be equal to the number of the real
roots of this polynomial between $a + \varepsilon$ and $b - \eta$; that is, from
what has been proved, it will be equal to the difference
$S(a + \varepsilon) - S(b - \eta)$ or less than this difference by an even number.
However, it is easy to see that

$$S(a + \varepsilon) = S_+(a), \quad S(b - \eta) = S_-(b)$$

This is proof of the following theorem.

Budan-Fourier theorem. If the real numbers $a$ and $b$, $a < b$, are
not the roots of a polynomial $f(x)$ with real coefficients, then the number
of real roots of this polynomial between $a$ and $b$, each counted according
to its multiplicity, is equal to the difference $S_+(a) - S_-(b)$ or is
an even number less than this difference.
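
The two counting conventions $S_+$ and $S_-$ can be sketched as follows. The helper names (`derivative_values`, `s_plus`, `s_minus`, `budan_fourier_bound`) are hypothetical. Zeros are handled exactly as stated above: for $S_+$ they are deleted, and for $S_-$ each zero receives the sign opposite to that of the entry following it, which reproduces the parity rule on $l - i$.

```python
def horner(coeffs, x):
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def derivative(coeffs):
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def derivative_values(coeffs, c):
    """Values f(c), f'(c), ..., f^(n)(c)."""
    values, p = [], list(coeffs)
    while p:
        values.append(horner(p, c))
        p = derivative(p)
    return values


def changes(signs):
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def s_plus(values):
    """S_+ : zeros are simply deleted."""
    return changes([v > 0 for v in values if v != 0])


def s_minus(values):
    """S_- : scanning from the right, a zero takes the sign opposite
    to the (possibly assigned) sign of the entry after it."""
    signs = [0] * len(values)
    for j in range(len(values) - 1, -1, -1):
        if values[j] != 0:
            signs[j] = 1 if values[j] > 0 else -1
        else:
            signs[j] = -signs[j + 1]   # last value n!*a_0 is never zero
    return changes(signs)


def budan_fourier_bound(coeffs, a, b):
    """S_+(a) - S_-(b); a and b must not be roots of the polynomial."""
    return s_plus(derivative_values(coeffs, a)) - s_minus(derivative_values(coeffs, b))


f = [1, 3, 0, -1]                      # x^3 + 3x^2 - 1, with f'(-2) = f'(0) = 0
print(budan_fourier_bound(f, -2, 0))
print(budan_fourier_bound(f, -3, 1))
```

When the bound equals 1, the number of roots in the interval is exactly 1, since it cannot be lowered by an even number.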

Use the symbol $\infty$ to denote a positive value of the unknown $x$
so large that the signs of the associated values of all the polynomials
of the sequence (1) coincide with the signs of their leading
coefficients. Since these coefficients are, sequentially, the numbers $a_0$,
$n a_0$, $n(n-1) a_0$, ..., $n!\, a_0$, whose signs coincide, it follows that
$S(\infty) = S_-(\infty) = 0$. On the other hand, since

$$f(0) = a_n, \quad f'(0) = a_{n-1}, \quad f''(0) = a_{n-2}\, 2!, \quad f'''(0) = a_{n-3}\, 3!, \quad \dots, \quad f^{(n)}(0) = a_0\, n!$$

where $a_0, a_1, \dots, a_n$ are the coefficients of the polynomial $f(x)$, then
$S_+(0)$ coincides with the number of variations in sign in the sequence
of coefficients of $f(x)$, zero coefficients being deleted. Thus, applying
the Budan-Fourier theorem to the interval $(0, \infty)$ we arrive at the
following theorem.

Descartes' theorem (Descartes' rule of signs). The number of
positive roots of a polynomial $f(x)$, a root of multiplicity $m$ being
counted as $m$ roots, is equal to the number of variations in sign in the
sequence of coefficients of this polynomial (zero coefficients are not
counted) or is less by an even number.

To determine the number of negative roots of the polynomial
$f(x)$ it is obviously sufficient to apply Descartes' theorem to the
polynomial $f(-x)$. If none of the coefficients of $f(x)$ is zero, then,
obviously, changes of sign in the sequence of coefficients of the
polynomial $f(-x)$ will be associated with preservation of signs in the
sequence of coefficients of the polynomial $f(x)$, and conversely. Thus,
if the polynomial $f(x)$ does not have zero coefficients, then the number
of its negative roots (counting multiplicities) is equal to the number of
preservations of signs in the sequence of coefficients or is less by an even
number.
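
Descartes' rule reduces to counting sign changes in two coefficient lists. A minimal sketch, with hypothetical names `variations` and `descartes_bounds`; the coefficients of $f(-x)$ are obtained by changing the sign of the coefficients of the odd powers.

```python
def variations(coeffs):
    """Variations in sign in a coefficient list, zeros not counted."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def descartes_bounds(coeffs):
    """Upper bounds (exact up to an even number) for the numbers of
    positive and negative roots; coeffs are highest degree first."""
    n = len(coeffs) - 1
    # coefficient of x^(n-i) in f(-x) is (-1)^(n-i) times that in f(x)
    reflected = [c * (-1) ** (n - i) for i, c in enumerate(coeffs)]
    return variations(coeffs), variations(reflected)


print(descartes_bounds([1, 2, -5, 8, -7, -3]))   # x^5 + 2x^4 - 5x^3 + 8x^2 - 7x - 3
print(descartes_bounds([1, 3, 0, -1]))           # x^3 + 3x^2 - 1
```

For a polynomial without zero coefficients the second number equals the number of preservations of sign in the original coefficient sequence.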

We give another proof of the Descartes theorem that does not
rely on the Budan-Fourier theorem. We first prove the following
lemma.

If $c > 0$, then the number of variations of sign in the sequence of
coefficients of the polynomial $f(x)$ is less than the number of variations
of sign in the sequence of coefficients of the product $(x - c) f(x)$ by an
odd number.

Indeed, enclosing in parentheses successive terms of the same
sign, we can write the polynomial $f(x)$, the leading coefficient $a_0$
of which can be considered positive, as follows:

$$f(x) = (a_0 x^n + \dots + b_1 x^{k_1+1}) - (a_1 x^{k_1} + \dots + b_2 x^{k_2+1}) + \dots + (-1)^s (a_s x^{k_s} + \dots + b_{s+1} x^t) \tag{5}$$

Here, $a_0 > 0$, $a_1 > 0$, ..., $a_s > 0$, whereas $b_1, b_2, \dots, b_s$ are
positive or zero, but $b_{s+1}$ is considered strictly positive; that is, $x^t$,
where $t \ge 0$, is the smallest power of the unknown $x$ that enters into
the polynomial $f(x)$ with a nonzero coefficient. The parenthesis

$$(a_0 x^n + \dots + b_1 x^{k_1+1})$$

may accidentally consist of a single term, namely, when $k_1 + 1 = n$.
An analogous remark is applicable to the other parentheses of
formula (5).

Now write a polynomial equal to the product $(x - c) f(x)$; we will
single out only those terms which contain $x$ to the powers $n + 1$,
$k_1 + 1$, ..., $k_s + 1$, and $t$. We obtain

$$(x - c) f(x) = (a_0 x^{n+1} + \dots) - (a_1' x^{k_1+1} + \dots) + \dots + (-1)^s (a_s' x^{k_s+1} + \dots - c\, b_{s+1} x^t) \tag{6}$$

where $a_i' = a_i + c\, b_i$, $i = 1, 2, \dots, s$, and therefore, since $c > 0$,
all the $a_i'$ are strictly positive. Thus, there was one change of sign
in the sequence of coefficients of the polynomial $f(x)$ between the
terms $a_0 x^n$ and $-a_1 x^{k_1}$ (also between the terms $-a_1 x^{k_1}$ and $a_2 x^{k_2}$,
etc.), whereas in the polynomial $(x - c) f(x)$ there will either be
one change of sign between the corresponding terms $a_0 x^{n+1}$ and
$-a_1' x^{k_1+1}$ (respectively between the terms $-a_1' x^{k_1+1}$ and $a_2' x^{k_2+1}$, etc.)
or more changes (but always more by an even number). We are not
interested in the exact places of these changes in sign. It may happen,
for example, that the coefficient of $x^{k_1+2}$ in (6) is negative, like the
coefficient $-a_1'$, and so there is no change of sign between these two
successive coefficients; that is to say, the change in sign in the first
parenthesis is located at some previous position. Now notice that
the last parenthesis in (5) did not have any variation in sign, whereas
the last parenthesis in (6) did have variations in sign, an odd number
of them: it suffices to note that the last nonzero coefficients of
the polynomials $f(x)$ and $(x - c) f(x)$, that is, $(-1)^s b_{s+1}$ and
$(-1)^{s+1} b_{s+1} c$, have different signs. Thus, between $f(x)$ and
$(x - c) f(x)$ the total number of variations of sign in the sequence
of coefficients invariably increases and by an odd number (the sum
of several terms, one of which is odd and the others even, will
naturally be odd!). The lemma is proved.
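
The lemma can be checked numerically on particular polynomials. The sketch below is only an illustration on a few arbitrarily chosen examples and is not a proof; the names `multiply_by_linear` and `variations` are hypothetical.

```python
def variations(coeffs):
    """Variations in sign in a coefficient list, zeros not counted."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def multiply_by_linear(coeffs, c):
    """Coefficients of (x - c) * f(x), highest degree first."""
    shifted = list(coeffs) + [0]               # x * f(x)
    scaled = [0] + [c * a for a in coeffs]     # c * f(x)
    return [s - t for s, t in zip(shifted, scaled)]


examples = [[1, 0, 1], [1, -1], [1, 2, -5, 8, -7, -3], [2, 0, 0, -3, 1]]
for f in examples:
    for c in (1, 2, 3):
        gain = variations(multiply_by_linear(f, c)) - variations(f)
        print(f, c, gain)
```

In every case printed the gain is a positive odd number, as the lemma asserts.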

To prove Descartes' theorem, denote all the positive roots of the
polynomial $f(x)$ by $\alpha_1, \dots, \alpha_k$. Then

$$f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_k)\, \varphi(x)$$

where $\varphi(x)$ is a polynomial with real coefficients which now has
no positive real roots. This implies that the first and the last
nonzero coefficients of the polynomial $\varphi(x)$ are of the same sign, which
means that the sequence of coefficients of this polynomial contains
an even number of variations of sign. Applying the above-proved
lemma to the polynomials

$$\varphi(x),\ (x - \alpha_1)\varphi(x),\ (x - \alpha_1)(x - \alpha_2)\varphi(x),\ \dots,\ f(x)$$

in succession, we find that the number of variations of sign in the
sequence of coefficients increases each time by an odd number, that
is to say, by unity plus an even number, and so the number of
variations of sign in the sequence of coefficients of the polynomial $f(x)$
is greater than $k$ by an even number.

Let us apply the theorems of Descartes and Budan-Fourier to
the earlier considered polynomial

$$h(x) = x^5 + 2x^4 - 5x^3 + 8x^2 - 7x - 3$$

The number of variations of sign in the sequence of coefficients
is three, and so by Descartes' theorem, $h(x)$ can have three positive
roots or one. On the other hand, $h(x)$ has no zero coefficients, and
since the sequence of coefficients has two preservations of sign,
$h(x)$ either has two negative roots or none. We compare with the
results obtained earlier with the aid of the graph and see that two
is the exact number of negative roots of our polynomial.

To determine exactly the number of positive roots, use the
Budan-Fourier theorem, applying it to the interval $(1, \infty)$, since in Sec. 39
it was demonstrated that 1 serves as a lower bound to the positive
roots of the polynomial $h(x)$. The successive derivatives of $h(x)$
were also written out in Sec. 39. Let us find their signs for $x = 1$
and $x = \infty$:

            h(x)   h'(x)   h''(x)   h'''(x)   h^IV(x)   h^V(x)   Number of variations in sign
x = 1        -       +       +        +         +         +        1
x = ∞        +       +       +        +         +         +        0

From this it follows that when $x$ moves from 1 to $\infty$ the sequence
of derivatives loses one change of sign, and so $h(x)$ has exactly one
positive root.

In connection with this example, it should be noted that, gene- 
rally speaking, when seeking the number of real roots of a polyno- 
mial it is best to begin by constructing a graph and applying the 
theorems of Descartes and Budan-Fourier, and then only in extreme 
cases to go on to construct a Sturm sequence. 

The Descartes theorem admits of a certain refinement in the 
special case when we know beforehand that all the roots of the poly- 
nomial are real, as for instance in the case of the characteristic 
polynomial of a symmetric matrix. Namely, 

If all the roots of a polynomial $f(x)$ are real, and the constant term
is nonzero, then the number $k_1$ of positive roots of the polynomial is
equal to the number $s_1$ of variations in sign in the sequence of its
coefficients, and the number $k_2$ of negative roots is equal to the number $s_2$
of variations in sign in the sequence of coefficients of the polynomial
$f(-x)$.

Indeed, under our assumptions,

$$k_1 + k_2 = n \tag{7}$$

where $n$ is the degree of the polynomial $f(x)$, and, by Descartes'
theorem,

$$k_1 \le s_1, \quad k_2 \le s_2 \tag{8}$$

We will prove that

$$s_1 + s_2 \le n \tag{9}$$

We will prove it by induction with respect to $n$, since for $n = 1$,
due to $a_0 \ne 0$, $a_1 \ne 0$, only one of the polynomials

$$f(x) = a_0 x + a_1, \quad f(-x) = -a_0 x + a_1$$

has a change of sign; that is, for this case, $s_1 + s_2 = 1$. Let formula
(9) be proved for polynomials whose degree is less than $n$. If

$$f(x) = a_0 x^n + a_l x^{n-l} + \dots + a_n$$

where $l \ge 1$, $a_l \ne 0$, we assume

$$g(x) = a_l x^{n-l} + \dots + a_n$$

Then

$$f(x) = a_0 x^n + g(x), \quad f(-x) = (-1)^n a_0 x^n + g(-x)$$

If $s_1'$ and $s_2'$ are, respectively, the numbers of variations in sign in
the sequences of coefficients of the polynomials $g(x)$ and $g(-x)$,
then, by the induction hypothesis (it is clear that $l \ge 1$),

$$s_1' + s_2' \le n - l$$

If $l$ is odd, then the variation in sign in the first place, i.e., for
$f(x)$, between $a_0$ and $a_l$, will occur only in the case of one of
the polynomials $f(x)$, $f(-x)$, and so

$$s_1 + s_2 = s_1' + s_2' + 1 \le (n - l) + 1 \le n$$

But if $l$ is even, then variations of sign are possible in the first
places of each of the polynomials $f(x)$, $f(-x)$; however, in this case
as well,

$$s_1 + s_2 \le s_1' + s_2' + 2 \le (n - l) + 2 \le n$$

Comparing (7), (8) and (9), we see that

$$k_1 = s_1, \quad k_2 = s_2$$

The proof is complete.


42. Approximation of Roots

The methods described in the preceding sections enable us to
isolate the real roots of a polynomial $f(x)$ with real coefficients;
that is to say, they permit indicating for each root the interval
containing it alone. If the interval is small enough, then any number
in the interval may be taken as an approximation of the desired
root. Thus, after it has been demonstrated by the Sturm method
(or any other more efficient method) that there is only one root of
the polynomial $f(x)$ between the rational numbers $a$ and $b$, the
problem remains of narrowing this interval so that the new limits
$a'$ and $b'$ possess a prescribed number of coincident first decimals.
The desired root will thus be computed to the needed accuracy.

There are many methods which permit us to speedily approximate
the value of a root with any desired accuracy. We will describe two.
They are simple theoretically and general enough so that when
used in conjunction they quickly yield results. The methods we are
about to describe can be applied not only to polynomials but also
to the broader classes of continuous functions.

From here on we assume that $\alpha$ is a simple root of a polynomial
$f(x)$, since we can always dispose of multiple roots, and that the
root $\alpha$ is isolated between the limits $a$ and $b$, $a < \alpha < b$; this
implies, for one thing, that $f(a)$ and $f(b)$ have different signs.

The method of linear interpolation (also called the method of 
false position or regula falsi). For an approximate value of the root 
$\alpha$ we could take, say, the half sum of the limits a and b, 
$\frac{a + b}{2}$, i.e., the midpoint of the interval from a to b. 
It is more natural, however, to assume that the root is closer to 
that endpoint of the interval (a, b) which corresponds to the smaller 
absolute value of the polynomial. The method of linear interpolation 
consists in taking for the approximate value of the root $\alpha$ 
a number c that divides the interval (a, b) into parts proportional 
to the absolute values of the numbers f(a) and f(b); that is, 

$$\frac{c - a}{b - c} = -\frac{f(a)}{f(b)}$$

The sign of the right member is minus because f(a) and f(b) have 
different signs. Whence 

$$c = \frac{b f(a) - a f(b)}{f(a) - f(b)} \tag{1}$$
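
As a small illustration of formula (1), the following Python sketch 
computes the point c for one interval. The function name 
`linear_interpolation` and the sample function x^2 - 2 on the interval 
(1, 2) are assumptions chosen for this illustration and are not taken 
from the text. 

```python
def linear_interpolation(f, a, b):
    """One application of formula (1) on an interval (a, b) isolating a root."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have different signs")
    return (b * fa - a * fb) / (fa - fb)


def sample(x):
    # Illustrative function (an assumption of this sketch): x^2 - 2.
    return x * x - 2


c = linear_interpolation(sample, 1.0, 2.0)
print(c)  # 4/3, which lies closer to the endpoint 1, where |f| is smaller
```

Here f(1) = -1 and f(2) = 2, so c divides the interval in the ratio 
1 : 2, as the proportion preceding formula (1) requires. 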

Geometrically, as Fig. 10 indicates, the method of linear inter- 
polation consists in replacing the curve y = f(x) on the interval 
(a, b) by its chord connecting the points A (a, f(a)) and B (b, f(b)); 
for the approximate value of the root $\alpha$ we take the abscissa 
of the point of intersection of the chord and the x-axis. 

Newton's method. Since $\alpha$ is a simple root of f(x), it follows 
that $f'(\alpha) \neq 0$. We also assume that $f''(\alpha) \neq 0$, 
since otherwise the problem would reduce to computing the root of 
the polynomial f''(x) of lower degree than f(x). We likewise assume 
that the interval (a, b) does not contain roots of f(x) different from 
$\alpha$; neither does it contain any root of the polynomial f'(x) or 
the polynomial f''(x).* Thus, as follows from mathematical analysis, 
the curve y = f(x) is either monotonic increasing on the interval 
(a, b) or monotonic decreasing; also, it is either convex up at all 
points of the interval or convex down at all points. Consequently, 
there are four cases (shown in Figs. 11 to 14) of the location of the 
curve on the interval (a, b). 

Denote by $a_0$ the endpoint a or b at which the sign of f(x) coin- 
cides with the sign of f''(x). Since f(a) and f(b) have different signs, 
and f''(x) preserves sign throughout the interval (a, b), such an $a_0$ 
can be indicated. In the cases given in Figs. 11 and 14, $a_0 = a$, 
in the other two cases, $a_0 = b$. At the point of the curve y = f(x) 
with abscissa $a_0$, that is, at the point with coordinates 
$(a_0, f(a_0))$, draw a tangent line to this curve and denote by d the 
abscissa of the intersection point of this tangent with the x-axis. 
Figs. 11 to 14 show that the number d may be taken as an approximate 
value of the root $\alpha$. The Newton method thus consists in 
replacing the curve y = f(x) on the interval (a, b) by its tangent at 
one of the endpoints of the interval. The condition imposed on the 
choice of the point $a_0$ is very essential. Fig. 15 shows that if 
this condition is not observed, the intersection point of the tangent 
line and x-axis may not at all give an approximation to the desired 
root. 

* There is usually no difficulty in narrowing the interval so that this 
condition is satisfied, since the methods given earlier permit establishing 
the number of roots of the polynomials f'(x) and f''(x) in any interval. 

Let us derive a formula for finding the number d. We recall that 
the equation of the tangent to the curve y = f(x) at the point 
$(a_0, f(a_0))$ may be written as 

$$y - f(a_0) = f'(a_0)(x - a_0)$$

Substituting the coordinates (d, 0) of the point of intersection of 
the tangent line with the x-axis, we get 

$$-f(a_0) = f'(a_0)(d - a_0)$$

whence 

$$d = a_0 - \frac{f(a_0)}{f'(a_0)} \tag{2}$$

If in Figs. 11-14 the reader connects A and B by chords, he will 
see that in all cases the methods of linear interpolation and of Newton 
yield approximations to the true value of the root $\alpha$ from 
different sides. It is therefore advisable, if the interval (a, b) is 
such as required by Newton's method, to combine the two methods. In 
this way we obtain much closer endpoints c and d for the root $\alpha$. 
If the accuracy of the approximation is not sufficient, apply both 
methods (see Fig. 16) once again to the interval, and so on. We can 
demonstrate that this process does indeed permit computing the root 
$\alpha$ to any desired accuracy. 
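
The combined procedure can be sketched in Python as follows. This is 
only an illustration under the assumptions stated in the text (f, f' 
and f'' have no roots on the interval other than the simple root 
sought); the names `chord_step`, `newton_step` and `refine`, the 
tolerance, and the sample function x^2 - 2 are all assumptions of the 
sketch. 

```python
def chord_step(f, a, b):
    """Formula (1): where the chord through (a, f(a)) and (b, f(b)) meets the x-axis."""
    return (b * f(a) - a * f(b)) / (f(a) - f(b))


def newton_step(f, df, a0):
    """Formula (2): where the tangent at (a0, f(a0)) meets the x-axis."""
    return a0 - f(a0) / df(a0)


def refine(f, df, d2f, a, b, tol=1e-9, max_steps=50):
    """Narrow (a, b) by applying both methods to each new interval in turn."""
    for _ in range(max_steps):
        if b - a <= tol:
            break
        # a0 is the endpoint at which f and f'' have the same sign.
        a0 = a if f(a) * d2f(a) > 0 else b
        c = chord_step(f, a, b)
        d = newton_step(f, df, a0)
        a, b = min(c, d), max(c, d)
    return a, b


def f(x):
    return x * x - 2      # illustrative function, root sqrt(2) in (1, 2)


def df(x):
    return 2 * x


def d2f(x):
    return 2.0


low, high = refine(f, df, d2f, 1.0, 2.0)
print(low, high)          # two close bounds enclosing sqrt(2)
```

For this sample function the first step replaces (1, 2) by (4/3, 3/2), 
the chord giving the lower endpoint and the tangent the upper one. 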

Let us apply these methods to the polynomial 

$$h(x) = x^5 + 2x^4 - 5x^3 + 8x^2 - 7x - 3$$

which we dealt with in preceding sections. 

We know that this polynomial has a simple root $\alpha_1$ lying 
between 1 and 2. We can say right off that these limits are too broad 
for the methods of linear interpolation and of Newton, used only once 
each, to yield a decent result. However, let us employ them so as to 
have one example that does not require involved computations. 

As we saw in Sec. 41, for x = 1 the derivatives h'(x), h''(x), ..., 
$h^{(5)}(x)$ receive positive values. This implies, on the basis of the 
results of Sec. 39, that the value x = 1 serves as an upper bound of 
the positive roots for h'(x) and also for h''(x). Hence, the interval 
(1, 2) does not contain any roots of these derivatives and so we can 
apply the Newton method. Besides, h''(x) is positive everywhere 
in the interval, and since 

$$h(1) = -4, \quad h(2) = 39$$

we have to take $a_0 = 2$. Seeing that h'(2) = 109, we get, by 
formula (2), 

$$d = 2 - \frac{39}{109} = \frac{179}{109} = 1.642\ldots$$

On the other hand, formula (1) yields 

$$c = \frac{2 \cdot (-4) - 1 \cdot 39}{-4 - 39} = \frac{47}{43} = 1.093\ldots$$

and, consequently, the root $\alpha_1$ lies within the interval 

$$1.09 < \alpha_1 < 1.65$$


This narrowing of the interval that we obtained is too slight 
to consider the result satisfactory. We could of course apply our 
methods to the new interval, but it is more advisable from the very 
beginning to find a sufficiently small interval for $\alpha_1$, say 
to within 0.1 or even 0.01, and only then apply the methods. Quite 
naturally, this at once makes all the computations very cumbersome, 
but in the solution of concrete problems requiring exact knowledge 
of the roots of a polynomial, this has to be done. 

Let us return to our polynomial h(x) and its root $\alpha_1$; note 
that all values of the polynomials given below are computed by the 
Horner method. Since 

$$h(1.3) = -0.13987, \quad h(1.31) = 0.0662923851$$

it follows that 

$$1.3 < \alpha_1 < 1.31$$

that is, we have the value of the root $\alpha_1$ to an accuracy of 
0.01. Now let us apply the method of linear interpolation to the new 
interval: 

$$c = \frac{1.31 \cdot (-0.13987) - 1.3 \cdot 0.0662923851}{-0.13987 - 0.0662923851} = \frac{0.26940980063}{0.2061623851} = 1.30678\ldots$$

We also apply Newton's method to this interval, setting 
$a_0 = 1.31$. Since 

$$h'(1.31) = 20.92822405$$

it follows that 

$$d = 1.31 - \frac{0.0662923851}{20.92822405} = \frac{27.3496811204}{20.92822405} = 1.30683\ldots$$

Thus, 

$$1.30678 < \alpha_1 < 1.30684$$

and therefore, setting $\alpha_1 = 1.30681$, we have an error of less 
than 0.00003. 
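
The figures of this example can be rechecked with exact rational 
arithmetic. The Python sketch below is only an illustration: the 
helper names `horner` and `derivative` are assumptions of the sketch, 
and the coefficient list encodes the polynomial h(x) given above. 

```python
from fractions import Fraction

# Coefficients of h(x) = x^5 + 2x^4 - 5x^3 + 8x^2 - 7x - 3, highest power first.
COEFFS = [1, 2, -5, 8, -7, -3]


def horner(coeffs, x):
    """Evaluate a polynomial by Horner's scheme."""
    value = 0
    for coefficient in coeffs:
        value = value * x + coefficient
    return value


def derivative(coeffs):
    """Coefficients of the derivative, highest power first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


a, b = Fraction("1.3"), Fraction("1.31")
ha, hb = horner(COEFFS, a), horner(COEFFS, b)
dhb = horner(derivative(COEFFS), b)

c = (b * ha - a * hb) / (ha - hb)   # formula (1)
d = b - hb / dhb                    # formula (2) with a_0 = 1.31

print(float(ha), float(hb), float(dhb))
print(float(c), float(d))
```

Working with `Fraction` avoids rounding, so the values of h(1.3), 
h(1.31) and h'(1.31) come out exactly as the decimals quoted in the 
text. 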

We have not yet shown that the foregoing methods actually 
permit computing a root to any desired accuracy, that is to say we 
have not proved the convergence of these methods. Let us do so at 
least with respect to Newton's method. 

As above, let the simple root $\alpha$ of the polynomial f(x) lie in 
the interval (a, b) chosen as required by the Newton method. For one 
thing, this implies the existence of positive numbers A and B such 
that everywhere on the interval (a, b), 

$$|f'(x)| > A, \quad |f''(x)| < B \tag{3}$$

We introduce the notation 

$$C = \frac{B}{2A}$$

and assume that 

$$C(b - a) < 1 \tag{4}$$

To fulfil this inequality it may be necessary to replace the interval 
(a, b) of the root $\alpha$ by a smaller one; but this will not affect 
the validity of inequalities (3). Let $a_0$ be the endpoint of the 
interval (a, b) at which Newton's method is to be applied. On the basis 
of formula (2) we get a succession of approximate values of the root 
$\alpha$: $a_1, a_2, \ldots, a_k, \ldots$, lying in the interval (a, b) 
and related by the equalities 

$$a_k = a_{k-1} - \frac{f(a_{k-1})}{f'(a_{k-1})}, \quad k = 1, 2, \ldots \tag{5}$$

Let 

$$\alpha = a_k + h_k, \quad k = 0, 1, 2, \ldots \tag{6}$$

Then 

$$0 = f(\alpha) = f(a_k) + h_k f'(a_k) + \frac{h_k^2}{2} f''(a_k + \theta h_k)$$

where $0 < \theta < 1$. Since $f'(a_k) \neq 0$ due to the condition 
imposed on the interval (a, b), we get, taking into account (5) and (6), 

$$-\frac{h_k^2}{2} \cdot \frac{f''(a_k + \theta h_k)}{f'(a_k)} = h_k + \frac{f(a_k)}{f'(a_k)} = \alpha - \left( a_k - \frac{f(a_k)}{f'(a_k)} \right) = \alpha - a_{k+1} = h_{k+1}$$

Whence 

$$|h_{k+1}| = h_k^2 \, \frac{|f''(a_k + \theta h_k)|}{2\,|f'(a_k)|} < \frac{B}{2A} h_k^2 = C h_k^2, \quad k = 0, 1, 2, \ldots$$

Therefore 

$$|h_{k+1}| < C h_k^2 < C^3 h_{k-1}^4 < C^7 h_{k-2}^8 < \ldots < C^{2^{k+1}-1} h_0^{2^{k+1}}$$

or, since $|h_0| = |\alpha - a_0| < b - a$, 

$$|h_{k+1}| < C^{-1} [C(b - a)]^{2^{k+1}}, \quad k = 0, 1, 2, \ldots \tag{7}$$

Whence, because of condition (4), it follows that the difference $h_k$ 
between the root $\alpha$ and its approximate value $a_k$ obtained by 
successive application of the Newton method tends to zero with 
increasing k. The proof is complete. 

Note that (7) gives an estimate of the error for the (k + 1)th 
step; this is essential if the Newton method is used by itself and 
not in conjunction with the method of linear interpolation. 
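
Estimate (7) can be tried out numerically on the polynomial h(x) and 
the interval (1.3, 1.31) of the example above. The sketch below is 
an illustration only. The bounds A = 20 and B = 63 are assumptions of 
the sketch: they rest on h'(1.3) = 20.3065 and h''(1.31) = 62.848..., 
both derivatives being increasing on this interval. The value used for 
the root is simply a further Newton iterate, taken as a stand-in for 
the exact root. 

```python
def h(x):
    return ((((x + 2) * x - 5) * x + 8) * x - 7) * x - 3


def dh(x):
    return (((5 * x + 8) * x - 15) * x + 16) * x - 7


a, b = 1.3, 1.31
A, B = 20.0, 63.0            # assumed bounds: |h'| > A and |h''| < B on (a, b)
C = B / (2 * A)
condition_4 = C * (b - a) < 1

iterates = [b]               # a_0 = 1.31, the endpoint where h and h'' agree in sign
for _ in range(8):
    x = iterates[-1]
    iterates.append(x - h(x) / dh(x))
alpha = iterates[-1]         # stand-in for the exact root

# Compare the actual error of a_1 and a_2 with the right member of (7).
report = []
for k in (0, 1):
    error = abs(alpha - iterates[k + 1])
    bound = (C * (b - a)) ** (2 ** (k + 1)) / C
    report.append((k + 1, error, bound))
    print(k + 1, error, bound)
```

Only the first two steps are compared, since beyond them the error 
falls below what ordinary floating-point arithmetic can resolve. 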

Texts dealing with the theory of approximations give procedures 
with better arranged computations (that simplify their use) than 
those we have given. Such courses also describe many other methods 
for approximating roots. These include the method of Lobachevsky 
(sometimes erroneously called the Graeffe method). This method 
enables one to find at once the approximate values of all roots, inclu- 
ding complex roots, and does not require a preliminary isolation 
of the roots. However, the computations are extremely unwieldy. 
Underlying this method is the theory of symmetric polynomials, 
which we describe in Chapter 11 below. 


CHAPTER 10 


FIELDS 

AND POLYNOMIALS 


43. Number Rings and Fields 

In the earlier parts of this book we have frequently been in a 
position where we investigated complex numbers or only real numbers 
with the proviso that the results obtained hold true if we restrict 
ourselves to the real numbers (or, correspondingly, that they carry 
over word-for-word to the case of any complex numbers). As a rule, 
in all those cases it might be noted that the theory would hold true 
completely if we confined ourselves solely within the scope of the 
rational numbers. The time has now come to indicate the reasons 
for this parallelism and thus enable us to present the material 
(which follows) in its natural generality, that is to say, in accepted 
algebraic language. To do this, we introduce the concept of a field, 
and also the broader concept (which plays a subsidiary role in our 
course) of a ring. 

Evidently, the systems of all complex, real and rational numbers, 
like the system of all integers, have one property in common: they are 
all closed not only under addition and multiplication, but under sub- 
traction as well. This property of the enumerated number systems 
distinguishes them, say, from the system of positive integers or posi- 
tive real numbers. 

Any system of numbers, complex or (in the particular case) 
real, containing the sum, the difference and the product of any two of 
its numbers is termed a number ring. Thus, the systems of all integers, 
and of rational, real and complex numbers are number rings. On 
the other hand, no system of positive numbers is a ring since if a 
and b are two different numbers, then either a - b or b - a is 
negative. Neither is a system of negative numbers a ring because 
the product of two negative numbers is positive. 

The four examples given above do not by any means exhaust 
the range of number rings. A few more instances will now be given; 
each time it is left to the reader to verify that the number system 
is indeed a ring. 

The even numbers form a ring; generally, for any natural number 
n the collection of integers exactly divisible by n is a ring. The 
odd numbers do not constitute a ring since the sum of two odd numbers 
is an even number. 

Another instance of a ring is the collection of rational numbers 
whose denominators, in lowest terms, are powers of 2. This collec- 
tion includes, for example, all integers, since when simplified their 
denominators are 1, that is, two to the power zero. In this example, 
in place of 2 we can of course take any prime number p. Generally, 
taking any (finite or infinite) set of prime numbers and considering 
the system of rational numbers whose simplified denominators are 
divisible only by primes belonging to the given set, we again get a 
ring. On the other hand, the collection of rational numbers whose 
simplified denominators are not divisible by the square of any prime 
will not be a ring, since the indicated property of the numbers is not 
preserved in their multiplication. 
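
The two claims of this paragraph can be spot-checked on particular 
numbers. The Python sketch below is an illustration only; the helper 
names `is_dyadic` and `has_squarefree_denominator` and the sample 
fractions are assumptions of the sketch. A check on samples is, of 
course, not a proof of closure. 

```python
from fractions import Fraction


def is_dyadic(q):
    """True if the denominator of q in lowest terms is a power of 2."""
    d = q.denominator
    return d & (d - 1) == 0


def has_squarefree_denominator(q):
    """True if no square of a prime divides the denominator of q in lowest terms."""
    d = q.denominator
    p = 2
    while p * p <= d:
        if d % (p * p) == 0:
            return False
        p += 1
    return True


x, y = Fraction(3, 4), Fraction(5, 8)
print([is_dyadic(v) for v in (x + y, x - y, x * y)])   # [True, True, True]

u = Fraction(1, 2)
# 1/2 has a square-free denominator, but its square 1/4 does not.
print(has_squarefree_denominator(u), has_squarefree_denominator(u * u))  # True False
```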

Let us now examine number rings that do not lie entirely in the 
ring of rational numbers. A collection of numbers of the form 

$$a + b\sqrt{2} \tag{1}$$

where a and b are any rational numbers, is a ring; in particular, 
this ring includes all rational numbers (for b = 0) and also the number 
$\sqrt{2}$ itself (for a = 0, b = 1). We would also have obtained a ring 
if we had confined ourselves to numbers of the form (1) with inte- 
gral coefficients a, b. In these examples, we could of course have 
taken $\sqrt{3}$ or $\sqrt{5}$, etc. in place of $\sqrt{2}$. 

The system of numbers of the form 

$$a + b\sqrt[3]{2} \tag{2}$$

with rational (or only integral) coefficients a, b is not a ring because 
the product of $\sqrt[3]{2}$ by itself cannot, as can easily be checked, 
be written as (2).* However, the system of numbers of the form 

$$a + b\sqrt[3]{2} + c\sqrt[3]{4} \tag{3}$$

with arbitrary rational coefficients a, b, c, is a ring, and this is 
also true if we confine ourselves to the case of integral coefficients. 

* Indeed, let 

$$\sqrt[3]{4} = a + b\sqrt[3]{2} \tag{2'}$$

where the numbers a and b are rational. Multiplying both sides of this 
equation by $\sqrt[3]{2}$, we get 

$$2 = a\sqrt[3]{2} + b\sqrt[3]{4}$$

Substituting the expression (2') for $\sqrt[3]{4}$ we arrive (after some 
obvious manipulations) at the equation 

$$(a + b^2)\sqrt[3]{2} = 2 - ab \tag{2''}$$

If $a + b^2 \neq 0$, then 

$$\sqrt[3]{2} = \frac{2 - ab}{a + b^2}$$

which is impossible since the number on the right is rational. But if 
$a + b^2 = 0$, then, by (2'') we have 2 - ab = 0. From these two 
equations follows the fact that $b^3 = -2$, which is again out of the 
question since the number b is rational. 

Let us now consider all real numbers obtainable by applying 
several times the operations of addition, multiplication and sub- 
traction to the familiar number $\pi$ and any rational numbers. 
These will be numbers that can be written as 

$$a_0 + a_1 \pi + a_2 \pi^2 + \ldots + a_n \pi^n \tag{4}$$

where $a_0, a_1, \ldots, a_n$ are rational numbers, $n \geq 0$. Note that 
no number can have two distinct notations of the type (4), for other- 
wise, by taking their difference, we would find that the number $\pi$ 
satisfies some equation with rational coefficients; now methods of 
mathematical analysis tell us that actually $\pi$ cannot satisfy any 
equation with rational coefficients, which is to say that $\pi$ is trans- 
cendental. Incidentally, even without taking advantage of this 
result, that is, without assuming that the notation of a number in the 
form (4) is unique, we can show that numbers like (4) constitute a ring. 

Another ring is the collection of numbers obtained from $\pi$ and 
rational numbers via operations of addition, multiplication, sub- 
traction and division applied several times. To prove this, there is 
no need to seek a particularly suitable notation for these numbers 
(though it may possibly be found). If the numbers $\alpha$ and $\beta$ 
are obtained from $\pi$ and some rational numbers by the indicated 
operations, then quite naturally it will be true of the numbers 
$\alpha + \beta$, $\alpha - \beta$, $\alpha\beta$ and also (for 
$\beta \neq 0$) of the number $\frac{\alpha}{\beta}$. 

Finally, if we take the collection of complex numbers a + bi 
with arbitrary rational a, b, we get a ring; this will also be true 
if we confine ourselves to integral coefficients a, b. 

The examples given above do not give a full picture of the great 
diversity of number rings. But we will not now continue the list 
of examples and will examine one special and very important type 
of number ring. We of course know that in the systems of rational, 
real, and complex numbers, division (except by zero) is unlimited, 
whereas the system of integers is not closed under division. Up to 
now we paid but slight attention to this difference. Actually, it is 
very essential and brings us to the following definition. 

A number ring is called a number field if it contains the quotient 
of any two of its numbers (the divisor is of course assumed to be 
different from zero). We can thus speak of the field of rational num- 
bers, the field of real numbers, the field of complex numbers, 
whereas the ring of integers does not constitute a field. 

Some of the earlier considered examples of number rings are 
actually fields. To begin with, notice that there do not exist number 
fields different from the field of rational numbers and entirely embed- 
ded in it (we do not consider the system of zero alone to be a field). 

Even the following more general assertion holds true. 

The field of rational numbers lies entirely within any number field. 

Indeed, let there be some number field, call it P. If a is any 
number of P different from zero, then P also contains the quotient 
of the division of a by itself, that is, the number 1. Adding unity 
to itself several times, we find that all the natural numbers lie in 
the field P. On the other hand, P must also contain the difference 
a - a, which is the number 0, and so P contains the result of sub- 
tracting any natural number from zero, which is to say, any negative 
integer. Finally, P contains the quotients of all integers, or, gene- 
rally, all rational numbers. 

The field of complex numbers contains many different fields, and 
the field of rational numbers is only the smallest in it. Thus, the 
ring, considered above, of numbers like 

$$a + b\sqrt{2} \tag{5}$$

with arbitrary rational (and not only integral) coefficients a, b is 
a field. To see this, consider the quotient of two numbers of the 
form (5), $a + b\sqrt{2}$ and $c + d\sqrt{2}$; consider the second number 
to be different from zero, hence the number $c - d\sqrt{2}$ is also 
nonzero, and so 

$$\frac{a + b\sqrt{2}}{c + d\sqrt{2}} = \frac{(a + b\sqrt{2})(c - d\sqrt{2})}{(c + d\sqrt{2})(c - d\sqrt{2})} = \frac{ac - 2bd}{c^2 - 2d^2} + \frac{bc - ad}{c^2 - 2d^2}\sqrt{2}$$

We again have a number of type (5), and the coefficients remain 
rational. In this example, the number $\sqrt{2}$ may naturally be repla- 
ced by the square root of any rational number whose square root 
cannot be taken in the field of rational numbers. Thus, a field is 
made up of numbers of the form a + bi with rational a, b. 
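
The quotient formula can be tried out with exact rational 
coefficients. In the Python sketch below, a number $a + b\sqrt{2}$ is 
represented by the pair (a, b); this representation and the function 
names `divide` and `multiply` are assumptions made for the 
illustration. 

```python
from fractions import Fraction


def multiply(x, y):
    """Product of a + b*sqrt(2) and c + d*sqrt(2), given as coefficient pairs."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)


def divide(x, y):
    """Quotient of a + b*sqrt(2) by a nonzero c + d*sqrt(2), by the formula of the text."""
    (a, b), (c, d) = x, y
    norm = c * c - 2 * d * d     # (c + d*sqrt 2)(c - d*sqrt 2); zero only if c = d = 0
    if norm == 0:
        raise ZeroDivisionError("the divisor must be different from zero")
    return ((a * c - 2 * b * d) / norm, (b * c - a * d) / norm)


x = (Fraction(1), Fraction(2))    # 1 + 2*sqrt(2)
y = (Fraction(3), Fraction(-1))   # 3 - sqrt(2)
q = divide(x, y)
print(q)                          # the pair (1, 1), that is, 1 + sqrt(2)
print(multiply(q, y) == x)        # True
```

The coefficients of the quotient are again rational, which is exactly 
what makes the collection of numbers (5) a field. 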


44. Rings 

In various divisions of mathematics, and also in applications 
of mathematics to science and engineering, one often has to perform 
algebraic operations with a variety of nonnumerical entities. The 
preceding chapters of this book afford numerous examples: the 
multiplication and addition of matrices, the addition of vectors, 
operations involving polynomials, operations on linear transforma- 
tions. The general definition of an algebraic operation that is sati- 
sfied by the operations of addition and multiplication in number 
rings, and also by operations in the indicated examples, consists 
in the following. 

A set M is given that consists either of numbers or of objects 
of a geometrical nature, or, generally, of certain things which we 
will call elements of the set. We say that an algebraic operation is 
defined on the set M if a law is indicated according to which any 
two elements a, b of the set are uniquely associated with some 
third element c which also belongs to M. This operation may be 
called addition, then c is termed the sum of the elements a and b and 
is denoted by the symbol c = a + b; the operation may be called 
multiplication, then c is the product of the elements a and b, c = ab; 
finally, it may be that a new terminology and symbolism will be 
introduced for an operation defined on M. 

In each of the number rings are defined two independent opera- 
tions, addition and multiplication. Subtraction and division will 
not be considered new operations since they are the inverses of addi- 
tion and multiplication if we accept the following general defini- 
tion of an inverse operation. 

Let an algebraic operation, say addition, be defined on the set 
M. Then we say that there is an inverse operation called subtraction 
if for any two elements a, b of M there exists in M an element d 
that is unique and that satisfies the equation b + d = a. The ele- 
ment d is then called the difference between the elements a and b 
and is denoted by the symbol d = a - b. 

It is obvious that in number fields, both addition and multipli- 
cation have inverses. True, there is one restriction relative to multi- 
plication: the divisor must be different from zero. Now in number 
rings that are not fields (say, in the ring of integers), only addition 
has an inverse operation. 

On the other hand, in the system of all polynomials in the un- 
known x, whose coefficients belong to a fixed number field P, there 
are also defined two operations: addition and multiplication, addi- 
tion having the inverse operation of subtraction. 

As we know, both in number rings and in the system of polyno- 
mials, the operations of addition and multiplication have the follo- 
wing properties (a, b, c are arbitrary numbers in the number ring 
under consideration or are arbitrary polynomials in the system at 
hand): 

I. Addition is commutative: a + b = b + a. 

II. Addition is associative: a + (b + c) = (a + b) + c. 

III. Multiplication is commutative: ab = ba. 

IV. Multiplication is associative: a (bc) = (ab) c. 

V. Multiplication is distributive over addition: 

(a + b) c = ac + bc. 

We are now prepared for a general definition of the concept of 
a ring, one of the most important concepts of algebra. 

A set R is called a ring if on it are defined two operations: addi- 
tion and multiplication, both commutative and associative and also 
related by the distributive law, addition having the inverse opera- 
tion of subtraction. 

We thus have the following examples of rings: number rings, 
rings of polynomials in the unknown x with coefficients from the 
given number field or even from the given number ring. Let us take 
one more example which illustrates the breadth of the ring 
concept. 

The course of mathematical analysis begins with a definition 
of a function of a real variable x. Let us consider the collection of 
functions that are defined for all real values of x and that take on 
real values; let us define algebraic operations in this collection as 
follows: the sum of two functions f(x) and g(x) is a function whose 
value for any $x = x_0$ is equal to the sum of the values of the given 
functions, that is, it is equal to $f(x_0) + g(x_0)$. The product of these 
functions is a function whose value for every $x = x_0$ is equal to the 
product $f(x_0) g(x_0)$. For any two functions of the collection at hand, 
there obviously exists a sum and a product. The truth of Proper- 
ties I to V is verified without any difficulty. The addition and multi- 
plication of functions reduce to the addition and multiplication 
of their values for any x, which is to say, they reduce to operations 
on real numbers, for which the Properties I to V hold. Finally, ta- 
king for the difference of the functions f(x) and g(x) a function whose 
value for any $x = x_0$ is equal to the difference $f(x_0) - g(x_0)$, we 
arrive at the operation of subtraction, the inverse of addition. This 
proves that the collection of functions defined for all real x becomes 
a ring as soon as we introduce (as indicated above) the operations of 
addition and multiplication. 

Other examples of rings of functions may be obtained by conside- 
ring otherwise defined functions, while preserving the definitions 
of operations on functions given above: functions defined, say, only 
for positive values of the unknown x, or functions defined for values 
of x over the interval [0, 1]. Generally, a system of all the functions 
having some given domain of definition is a ring. We could also obtain 
rings by regarding not all the functions defined in a given domain, 
but only the continuous functions studied in the course of mathema- 
tical analysis. On the other hand, we could consider the complex 
functions of a complex variable. Generally speaking, there are very 
many different function rings, just as there is a great diversity of 
number rings. 

Let us now establish some of the more elementary properties 
of rings which follow directly from the definition of a ring. For 
numbers, those properties are quite ordinary, but the reader will 
possibly be surprised to find that they are consequences only of the 
Conditions I to V and the existence of unique subtraction. 

First a few remarks regarding the significance of Conditions I 
to V. The role of the commutative laws is evident enough. The signi- 
ficance of the associative laws consists in the following: the defini- 
tion of an algebraic operation speaks of the sum or product of only 
two elements. If we attempt to define the product of, say, three 
elements a, b, c, then we have the difficulty that the products au 
and vc, where bc = u, ab = v, may, generally speaking, not coin- 
cide, that is, $a(bc) \neq (ab)c$. The associative law demands that 
these products be equal to one and the same element of the ring: 
it is natural to take this element for the product abc, written without 
brackets. What is more, the associative law permits defining uniquely 
the product (sum) of any finite number of elements of the ring; that 
is, it permits proving that a product of any n elements is independent 
of the original arrangement of parentheses. 

Let us prove this assertion by means of induction with respect 
to the number n. It has already been proved for n = 3, and so let 
us assume n > 3 and also that for all numbers less than n our asser- 
tion has already been proved. Let there be elements 
$a_1, a_2, \ldots, a_n$ and let there be some kind of arrangement of 
parentheses in this system indicating the order in which multiplication 
is to be performed. The last step will be the multiplication of the 
product of the first k elements $a_1 a_2 \ldots a_k$ (where 
$1 \leq k \leq n - 1$) by the product $a_{k+1} a_{k+2} \ldots a_n$. Since 
these products consist of a smaller, than n, number of factors and for 
this reason, by hypothesis, are uniquely defined, it remains to prove 
the following equation for any k and l: 

$$(a_1 a_2 \ldots a_k)(a_{k+1} a_{k+2} \ldots a_n) = (a_1 a_2 \ldots a_l)(a_{l+1} a_{l+2} \ldots a_n)$$

To do this, it will suffice to consider the case l = k + 1. But then, 
setting 

$$a_1 a_2 \ldots a_k = b, \quad a_{k+2} a_{k+3} \ldots a_n = c$$

we get, by the associative law, 

$$b(a_{k+1} c) = (b a_{k+1}) c$$

Which proves our assertion. 

We can speak, in particular, about the product of n equal ele- 
ments; that is, we can introduce the concept of a power, $a^n$, of the 
element a with positive integral exponent n. It is easy to verify that 
all the ordinary rules for operating with exponents hold true in any 
ring. Analogously, the associative law of addition leads to the 
concept of a multiple, na, of the element a by a positive integral 
coefficient n. 

The distributive law, that is, the usual rule for removing brackets, 
is the only requirement in the definition of a ring that connects addi- 
tion and multiplication; it is only through this law that the joint 
study of the two indicated operations yields more than could be 
obtained in their separate study. The statement of the distributive 
law involves the sum of only two terms. However, it can readily 
be proved that the equality 

$$(a_1 + a_2 + \ldots + a_k) b = a_1 b + a_2 b + \ldots + a_k b$$

holds for any k and that the general rule of multiplication of a sum 
by a sum is true. 

Also, the distributive law holds true in any ring for a difference as 
well. Indeed, by the definition of a difference, the element a - b 
satisfies the equality 

$$b + (a - b) = a$$

Multiplying both sides of this equation by c and applying the distri- 
butive law to the left member, we get 

$$bc + (a - b)c = ac$$

Element (a - b) c is consequently the difference of the elements ac 
and bc: 

$$(a - b)c = ac - bc$$

Very important properties of rings follow from the existence 
of subtraction. If a is an arbitrary element of a ring R, then the 
difference a - a will be some quite definite element of the ring. 
Its role is similar to that of zero in number rings, but, by definition, 
it may depend on the choice of the element a and therefore we will 
provisionally denote it by $0_a$. 

We will prove that actually the elements $0_a$ are equal for all a. 
Indeed, if b is some other arbitrary element of a ring R, then by 
adding the element $0_a$ to both sides of the equation 

$$a + (b - a) = b$$

and using the equation $0_a + a = a$, we get 

$$b + 0_a = 0_a + a + (b - a) = a + (b - a) = b$$

Thus, $0_a = b - b = 0_b$. 

We have proved that any ring R possesses a uniquely defined ele- 
ment which when added to any element a of that ring gives a. We call 
this element the zero element of the ring R and we denote it by 0. We 
believe there is no real danger of confusing it with the number zero. 
Thus, 

$$a + 0 = a \quad \text{for all } a \text{ in } R$$

To continue, in any ring there exists for any element a a uniquely 
defined inverse element -a which satisfies the equation 

$$a + (-a) = 0$$

Namely, this element is the difference 0 - a; the uniqueness follows 
from the uniqueness of subtraction. It is obvious that -(-a) = a. 
The difference b - a of any two elements of a ring may now be 
written as 

$$b - a = b + (-a)$$

Indeed, 

$$[b + (-a)] + a = b + [(-a) + a] = b + 0 = b$$

For any element a of the ring and for any positive integer n we 
have the equality 

$$n(-a) = -(na)$$

And true enough, grouping the terms we get 

$$na + n(-a) = n[a + (-a)] = n \cdot 0 = 0$$

We are now in a position to define negative multiples of an ele- 
ment of a ring: if n > 0, then the equal elements n (-a) and -(na) 
will be denoted by (-n) a. Let us finally agree to use the term zero 
multiple $0 \cdot a$ of any element a for the zero element of the ring 
under consideration. 

We have defined zero solely by means of the operation of addi- 
tion and its inverse, that is to say, without using multiplication. 
However, in the case of numbers, the number zero has a characte- 
ristic and very important property with respect to multiplication 
too. It turns out that this property is possessed by the zero element 
of every ring: in any ring the product of any element by zero is zero. 
The proof rests directly on the distributive law: if a is an arbitrary 
element of a ring R, then no matter what the auxiliary element x 
of this ring, we get 

$$a \cdot 0 = a(x - x) = ax - ax = 0$$

Using this property of zero, we can prove that in any ring the 
following equality holds for any elements a, b: 

$$(-a) b = -ab$$

True enough, 

$$ab + (-a) b = [a + (-a)] b = 0 \cdot b = 0$$

Which implies that the familiar yet somewhat mysterious rule 
for the multiplication of negative numbers, "two negatives make 
a positive", also follows from the definition of a ring, that is, in 
any ring we have the equality 

$$(-a)(-b) = ab$$

Indeed, 

$$(-a)(-b) = -[a(-b)] = -(-ab) = ab$$

The reader will not find any difficulty now in proving that in any 
ring all the rules for operating with the multiples of any number hold 
true for the multiples (including negative multiples) of any element. 

Thus, the algebraic operations in an arbitrary ring have many 
of the familiar properties of operations on numbers. However, one 
should not think that every property of addition and multiplica- 
tion of numbers is preserved in any ring. For instance, the multi- 
plication of numbers has a property which is the converse of the 
one considered above: if a product of two numbers is equal to zero, 
then at least one of the factors is zero. This property cannot be 
carried over to all rings. In some rings we can find pairs of nonzero 
elements whose product is equal to zero, that is, $a \neq 0$, 
$b \neq 0$, but ab = 0; elements a and b with this property are called 
divisors of zero. 

Naturally, among the number rings one cannot find any instances 
of rings with zero divisors. Likewise there are no zero divisors among 
the rings of polynomials with numerical coefficients. However, many 
function rings have zero divisors. First of all, let us note that in any 
function ring a zero is a function equal to zero for all values of the 
variable x. Let us now construct the following functions f (x) and
g (x) defined for all real values of x:

f (x) = 0 for x ≤ 0, f (x) = x for x > 0,

g (x) = x for x ≤ 0, g (x) = 0 for x > 0

Both functions are nonzero since their values are not equal to zero
for all values of x, but the product of these functions is zero.
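
The construction can be checked numerically. The following Python sketch is an illustration only: the names f, g, product_fg and the grid SAMPLES are our own choices, and a finite set of sample points can merely suggest, not prove, that the product vanishes for all real x.

```python
def f(x: float) -> float:
    """f(x) = 0 for x <= 0 and f(x) = x for x > 0."""
    return x if x > 0 else 0.0


def g(x: float) -> float:
    """g(x) = x for x <= 0 and g(x) = 0 for x > 0."""
    return x if x <= 0 else 0.0


def product_fg(x: float) -> float:
    """Value at x of the product of f and g in the ring of functions."""
    return f(x) * g(x)


# Sample points from -5 to 5 in steps of 0.25 (an arbitrary choice).
SAMPLES = [k / 4 for k in range(-20, 21)]

if __name__ == "__main__":
    print(all(product_fg(x) == 0 for x in SAMPLES))  # the product is zero at each sample
    print(any(f(x) != 0 for x in SAMPLES))           # yet f is not the zero function
    print(any(g(x) != 0 for x in SAMPLES))           # and neither is g
```

At every point one of the two factors vanishes, which is exactly why the product is the zero function although neither factor is.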

Not all the requirements I to V that enter into the definition
of a ring are necessary in equal measure. The development of
mathematics shows that whereas the properties I and II of addition and
the distributive law V occur in all applications, the inclusion of
the multiplication properties III and IV in the definition of a ring
is too confining and narrows the sphere of application of this
concept. Thus, when the set of square matrices of order n with real
elements is regarded with the operations of addition and
multiplication of matrices, it satisfies all the requirements in the definition
of a ring, with the exception of the commutative law of
multiplication. Noncommutative multiplications are encountered so often
and in such important instances that the term “ring” is now usually
interpreted to mean a noncommutative ring (or, more precisely, a not
necessarily commutative ring, in the sense of possible
noncommutativity of multiplication), and the special type of ring in which
requirement III is fulfilled is termed a commutative ring.

There has also been much interest recently in rings with nonas- 
sociative multiplication and the general theory of rings under con- 
struction is now a theory of nonassociative (that is to say, not neces- 
sarily associative) rings. An elementary instance of such a ring is 
the set of vectors of three-dimensional Euclidean space under the 
operations of the addition and (taken from the course of analytic 
geometry) the vector multiplication of vectors. 
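
That vector multiplication is not associative can be seen on a single triple of vectors. The sketch below is our own illustration (the function name cross and the choice of the unit vectors are assumptions, not taken from the text); it compares (i × i) × j with i × (i × j).

```python
def cross(u, v):
    """Vector (cross) product of two vectors of three-dimensional space."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


I = (1, 0, 0)
J = (0, 1, 0)

LEFT = cross(cross(I, I), J)   # (i x i) x j
RIGHT = cross(I, cross(I, J))  # i x (i x j)

if __name__ == "__main__":
    print(LEFT, RIGHT)
```

Here the first product is the zero vector, because i × i = 0, while the second equals −j, so the two groupings disagree.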


45. Fields 

In the set of number rings, we singled out and gave the name 
number fields to those rings which admit division (except by zero). 
It is natural to do this in the general case as well. First note that 
no ring admits division by zero in virtue of the above-proved property 
of zero under multiplication: to divide an element a by zero means 
to find, in that ring, an element x such that 0·x = a, which for
a ≠ 0 is impossible, since the left-hand side is equal to zero.

Let us introduce the following definition. 

A ring P is termed a field if it consists of more than zero alone 
and if division can be performed uniquely* in all cases except
division by zero; that is to say, for any elements a and b in P, b ≠ 0,
there is in P a unique element q which satisfies the equality bq = a.
The element q is called the quotient of the elements a and b and is
denoted by the symbol q = a/b.

Quite naturally, all number fields are instances of fields. A ring 
of polynomials in the unknown x with real coefficients and, gene- 
rally, with coefficients taken from some number field, is not a field. 
The division with a remainder that polynomials have differs of 
course from exact division, which is assumed in the definition of 
a field. On the other hand, it is easy to see that the set of all fractional 
rational functions with real coefficients (see Sec. 25) will be a field 
containing the ring of polynomials, just like the field of rational 
numbers contains the ring of integers. 

We could point to certain other instances of fields within the 
ring of functions, but instead we will examine examples of quite 
a different sort. 

All the number rings, and in general all the rings we have
considered so far, contain infinitely many elements. There are, however,
rings and even fields consisting only of a finite number of elements.
The simplest examples of finite rings and finite fields, which are
essential objects in the theory of numbers, are constructed in the
following manner.

* The uniqueness of division in a field, just like the assumed uniqueness
of subtraction in the definition of a ring, can actually be proved without any
difficulty by means of the requirements that enter into the definition of a field
(or ring).

Take any natural number n different from 1. The integers a and b
are called congruent modulo n,

a ≡ b (mod n)

if these numbers yield the same remainder when divided by n,
that is to say, if their difference is exactly divisible by n. The entire
ring of integers is separated into n mutually exclusive
(nonintersecting) classes

C_0, C_1, ..., C_{n−1}   (1)

of numbers congruent modulo n; the class C_k, k = 0, 1, ..., n − 1,
consists of numbers which yield, upon division by n, the remainder
k. It turns out that it is possible, in a very natural way, to define 
the addition and multiplication of these classes. 

For this purpose, let us take any (not necessarily distinct) classes
C_k and C_l from the system (1). Adding any number of class C_k to
any number of class C_l, we obtain numbers lying in one very
definite class, namely, in the class C_{k+l} if k + l < n, or in the class
C_{k+l−n} if k + l ≥ n. This leads to the following definition of the
addition of classes:

C_k + C_l = C_{k+l} for k + l < n,
C_k + C_l = C_{k+l−n} for k + l ≥ n   (2)

On the other hand, multiplying any number of class C_k by any
number of class C_l we get numbers lying in a definite class, namely
the class C_r, where r is the remainder left after dividing the product
kl by n. We thus have the following definition of the multiplication
of classes:

C_k · C_l = C_r, where kl = nq + r, 0 ≤ r < n   (3)

The system (1) of classes of integers congruent modulo n is a ring
with respect to the operations defined by the conditions (2) and (3).
Indeed, the validity of the requirements I-V from the definition of
a ring is readily verified directly, but it also follows from the truth of
these requirements in the ring of integers and from the relationship,
indicated above, between operations on integers and operations
on classes. Zero is obviously the class C_0 consisting of numbers exactly
divisible by n. The class opposite to C_k, k = 1, 2, ..., n − 1, is
the class C_{n−k}. In the system (1) of classes it is thus possible to
define subtraction, that is, this system satisfies all the requirements
of the definition of a ring. Let us agree to denote the resulting ring
by Z_n.

If the number n is a composite number, then the ring Z_n possesses
zero divisors and therefore, as will be shown below, it cannot be a
field. Indeed, if n = kl where 1 < k < n, 1 < l < n, then the classes
C_k and C_l are different from the zero class C_0, but by the definition
of the multiplication of classes (see (3)), C_k · C_l = C_0.

But if the number n is prime, then the ring Z_n is a field.

To see this, let there be classes C_m and C_k, C_k ≠ C_0, i.e.,
1 ≤ k ≤ n − 1. We have to show that it is possible to divide C_m by
C_k, or to find a class C_l such that C_k · C_l = C_m. If C_m = C_0, then
C_l = C_0 as well. But if C_m ≠ C_0, then we consider the set of numbers

k, 2k, 3k, ..., (n − 1) k   (4)

All these numbers lie outside the zero class C_0, since the product of
two natural numbers less than a prime n is not divisible by n. Also,
no two numbers sk and tk from (4), s < t, can be in one class, for
then their difference

tk − sk = (t − s) k

would be divisible by n, which again is in conflict with the primality
of the number n. Thus the n − 1 numbers of the set (4) lie in n − 1
distinct nonzero classes, so that every nonzero class contains exactly
one number from the set (4). For instance, in the class C_m there is the
number lk, where 1 ≤ l ≤ n − 1, that is, C_l · C_k = C_m, and then the class C_l
will be the desired quotient resulting from the division of C_m by C_k.
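
The rules (2) and (3) and the argument with the set (4) are easy to try out on small moduli. In the following Python sketch each class C_k is represented by its index k; the function names add_classes, mul_classes, zero_divisor_pairs and quotients are hypothetical names chosen for this illustration, and the search for a quotient is a plain scan over the multiples of k rather than an efficient algorithm.

```python
def add_classes(k: int, l: int, n: int) -> int:
    """Rule (2): index of the class C_k + C_l in Z_n."""
    s = k + l
    return s if s < n else s - n


def mul_classes(k: int, l: int, n: int) -> int:
    """Rule (3): index r of C_k * C_l, where kl = nq + r, 0 <= r < n."""
    return (k * l) % n


def zero_divisor_pairs(n: int) -> list:
    """All pairs (k, l) of nonzero classes whose product is the zero class."""
    return [
        (k, l)
        for k in range(1, n)
        for l in range(1, n)
        if mul_classes(k, l, n) == 0
    ]


def quotients(m: int, k: int, n: int) -> list:
    """All l with C_k * C_l = C_m, found by scanning 0, k, 2k, ..., (n-1)k."""
    return [l for l in range(n) if mul_classes(k, l, n) == m]


if __name__ == "__main__":
    print(zero_divisor_pairs(6))  # composite modulus: zero divisors exist
    print(zero_divisor_pairs(7))  # prime modulus: none
    print(quotients(3, 2, 7))     # division of C_3 by C_2 in Z_7
```

For n = 7 the call quotients(3, 2, 7) returns the single class index 5, since 2·5 = 10 = 7 + 3, whereas for n = 6 the class C_1 cannot be divided by C_2 at all.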

We have thus obtained an infinity of distinct finite fields: the
field Z_2, consisting of only two elements, and also the fields Z_3, Z_5,
Z_7, Z_11 and so on.

Let us examine some properties of fields that follow from the 
existence of division. These properties are similar to those of rings 
based on the existence of subtraction and are demonstrated by the 
same arguments, and so the proof will be left to the reader. 

Every field P has a uniquely defined element whose product by any
element a of the field is equal to a. This element, which coincides with
the equal quotients a/a for all nonzero a, is called the unity (unit) element
of the field P and is denoted by 1. Thus,

a·1 = a for all a in P

For every nonzero element a, there is, in every field, a unique inverse
element a^{-1} which satisfies the equality

a·a^{-1} = 1

namely, a^{-1} = 1/a. It is obvious that (a^{-1})^{-1} = a. The quotient b/a
may now be written in the form

b/a = b·a^{-1}



CH. 10. FIELDS AND POLYNOMIALS 


For any element a different from zero and for any positive
integer n we have the equality

(a^{-1})^n = (a^n)^{-1}

Denoting these equal elements by a^{-n}, we arrive at negative powers
of an element of the field for which the ordinary operating rules hold.
Let us finally agree that a^0 = 1 for all a.

The existence of a unit element is not a characteristic property
of fields: the ring of integers, for instance, has a unit element. Yet
the example of the ring of even numbers shows that not all rings
possess a unit element. On the other hand, any ring possessing a unit
element and an inverse for every nonzero element is a field. Indeed, in
this case for the quotient b/a, a ≠ 0, we have the product ba^{-1}. It
is easy to prove the uniqueness of this quotient.

Notice that no field has zero divisors. Let ab = 0, but a ≠ 0.
Multiplying both sides of the equality by the element a^{-1}, we get
(a^{-1} a) b = 1·b = b on the left and a^{-1}·0 = 0 on the right, or b = 0.

From this it follows that in any field any equality may be divided by a
common nonzero factor. This is so, since if ac = bc and c ≠ 0, then
(a − b) c = 0, whence a − b = 0, or a = b.

From the definition of the quotient a/b (where b ≠ 0) and from
the above-proved possibility of writing it as the product ab^{-1}, it is
easy to see that all the ordinary rules for handling fractions hold true
in any field, namely:

a/b = c/d if and only if ad = bc,

a/b ± c/d = (ad ± bc)/(bd),

(a/b)·(c/d) = (ac)/(bd),

(−a)/b = −(a/b)

The characteristic of a field. Not all properties of number fields
hold true in the case of arbitrary fields. Say, if we take the number 1
and add 1 to it several times, that is, if we take any positive integral
multiple of one, we will never get zero, and, generally, all these multiples
(that is, all natural numbers) are distinct. But if we take integral
multiples of unity in some finite field, then there will invariably be
equal integral multiples, since the field has only a finite number of
distinct elements. If all the integral multiples of unity of a field P
are distinct elements of P, that is, k·1 ≠ l·1 for k ≠ l, then we say
that the field P has characteristic zero. Such for example are all the
number fields. But if there exist integers k and l such that k > l,
but in P we have the equality k·1 = l·1, then (k − l)·1 = 0,
i.e., there exists in P a positive multiple of unity which is equal
to zero. In this case P is called a field of finite characteristic, namely
of characteristic p, if p is the first positive coefficient with which
the unit element of the field P vanishes. All finite fields are examples
of fields of finite characteristic. Incidentally, there also exist
infinite fields having a finite characteristic.

If a field P has a characteristic p, then the number p is prime. 

Indeed, from the equality p = st, where s < p, t < p, would
follow the equality (s·1) (t·1) = p·1 = 0, that is to say, since a
field cannot have zero divisors, either s·1 = 0 or t·1 = 0,
which, however, runs counter to the definition of a characteristic as
the least positive coefficient which makes the unit element of the
field vanish.

If the characteristic of a field P is equal to p, then for any element a
of the field we have the equality pa = 0. But if the characteristic of
the field P is 0 and a is an element of the field, n an integer, then from
a ≠ 0 and n ≠ 0 it follows that na ≠ 0.

Indeed, in the first case the element pa (that is, the sum of p
terms equal to a) can, by factoring out a, be represented as

pa = a (p·1) = a·0 = 0

In the second case, from the equality na = 0, that is, a (n·1) = 0,
we would get, since a ≠ 0, that n·1 = 0; that is, since the characteristic of
the field is zero, n = 0.
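
For the finite fields Z_p of Sec. 45 the characteristic can be found by literally adding the unit element to itself until zero appears. The short Python sketch below is our own illustration (the names characteristic_of_Zp and multiple are hypothetical); it also checks the equality pa = 0 for every element a.

```python
def characteristic_of_Zp(p: int) -> int:
    """Least positive m with m·1 = 0 in Z_p, found by repeated addition of unity."""
    total = 1 % p
    m = 1
    while total != 0:
        total = (total + 1) % p
        m += 1
    return m


def multiple(m: int, a: int, p: int) -> int:
    """The multiple m·a of the element a in Z_p, i.e. a added to itself m times."""
    return (m * a) % p


if __name__ == "__main__":
    print(characteristic_of_Zp(7))
    print(all(multiple(7, a, 7) == 0 for a in range(7)))
```

As expected, the field Z_p has characteristic p, in agreement with the theorem that the characteristic of a field is prime.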

Subfields, extensions. Suppose in the field P a portion of the
elements (some set P') is itself a field with respect to the operations
defined in P; that is to say, for any two elements a, b in P', the
elements (in the field P) a + b, ab, a − b, and, for b ≠ 0, a/b belong
to P' (the laws I to V will of course hold in P' since they hold in P).
Then P' is a subfield of the field P, and P is an extension of the field P'.
Quite naturally, the zero and unity of P will lie in P' as well and
will also serve in P' as zero and unity. Thus, the field of rational
numbers is a subfield of the field of real numbers; all number fields
are subfields of the field of complex numbers.

Let there be given in the field P a subfield P' and an element c
exterior to P' and suppose we have a minimum subfield P'' of P
which contains both P' and c. There can only be one such minimum
subfield, since if P''' were one more subfield with these properties,
then the intersection of the subfields P'' and P''' (i.e., the collection
of elements common to both subfields) would contain P' and the
element c and, together with any two of its elements, it would
contain their sum (this sum must lie both in P'' and in P''', and so
also in their intersection) and likewise their product, difference and
quotient; in other words the intersection would itself be a subfield,
but this contradicts the minimality of the subfield P''. We will say
that the field P'' is obtained by adjoining an element c to the field P',
symbolically, P'' = P' (c). The field P' (c) contains, besides the element c and
all the elements of the field P', also all the elements which are
derived from them by the operations of addition, multiplication,
subtraction and division. By way of illustration recall the extension
(considered in Sec. 43) of the field of rational numbers consisting
of numbers of the form a + b√2 with rational a, b; this extension
results from adjoining the number √2 to the field of rational
numbers.

46. Isomorphisms of Rings (Fields).

The Uniqueness of the Field of Complex Numbers 

The concept of an isomorphism plays an important role in the
theory of rings. Namely, the rings L and L' are called isomorphic if
a one-to-one correspondence can be set up between them such that
for any elements a, b in L and for the corresponding elements a', b'
in L', the sum a + b corresponds to the sum a' + b', and the
product ab corresponds to the product a'b'.

Suppose an isomorphic correspondence has been set up between the
rings L and L'. Then the zero 0 of the ring L is associated with the
zero 0' of L'. Indeed, suppose the element 0 is associated with the
element c' of L'. Take an arbitrary element a of L and the associated
element a' of L'. Then to the element a + 0 there has to correspond
the element a' + c'; but a + 0 = a, and so a' + c' = a', whence
c' = 0'. Furthermore, the element −a is associated with the element
−a'. Indeed, let the element −a be associated with the element d'.
Then to the element a + (−a) = 0 there will have to correspond
the element a' + d', that is, a' + d' = 0', whence d' = −a'. This
implies that to a difference of elements in L there corresponds the
difference of the corresponding elements of L'. By similar reasoning it
may be shown that if the ring L has a unit element, then the image
of this element (i.e., the element corresponding to it under
the given isomorphism) will be the unit element of the ring L',
and if the element a from L has the inverse a^{-1}, then in L' the image
of a^{-1} is the inverse element of a'.

This implies that a ring isomorphic to a field is itself a field. It
is also easy to see that the property of a ring not to have zero
divisors also holds in an isomorphic correspondence. Generally,
isomorphic rings can differ as to the nature of their elements, but
they are identical with respect to their algebraic properties. Every
theorem which has been proved relative to some ring will hold true
for all rings isomorphic to that ring, provided that the proof does
not involve any individual properties of the elements of the ring
but only the properties of the operations. For this reason we will
not consider isomorphic rings or fields to be distinct; for us they will
simply be different copies of one and the same ring or field.

Let us apply this concept to the problem of constructing the 
field of complex numbers. The construction, given in Sec. 17, of 
the field of complex numbers was based on the use of points in the 
plane. This is not the only possible construction. In place of points, 
we could have taken line segments (vectors) in the plane that emanate 
from the coordinate origin, and by specifying these vectors via their 
components a, b on the coordinate axes, we could have defined addi- 
tion and multiplication of the vectors with the aid of the same formu- 
las (2) and (3) of Sec. 17, as in the case of points in the plane. We 
could have gone further still and dispensed with geometrical mate- 
rial altogether; noting that points in a plane and also vectors in a 
plane can be represented by ordered pairs of real numbers (a, b), we
could simply take the collection of all such pairs and introduce 
addition and multiplication via formulas (2) and (3) of that section. 

With respect to their algebraic properties, all these fields would
be indistinguishable, as witness the following theorem.

All extensions of the field D of real numbers derived by adjoining
to D a root of the equation

x^2 + 1 = 0   (1)

are isomorphic among themselves.

Indeed, suppose we have a field P which is an extension of the
field D and contains an element satisfying equation (1). The choice
of notation for this element is up to us, and we use the letter i. We thus
get the equation i^2 + 1 = 0 (whence i^2 = −1), where raising to a power
and addition are to be understood in the sense of the operations
defined in the field P. We now want to find the field D (i) obtained
by adjoining the element i to the field D, that is, we wish to find the
minimal subfield of the field P containing both D and the element i.

For this purpose, let us examine all the elements α of the field
P which can be written in the form

α = a + bi   (2)

where a and b are arbitrary real numbers, and the product of the
number b by the element i and the sum of the number a and this
product are to be understood in the sense of the operations defined in
the field P. No element α of P can possess two different
representations of that form: from

α = a + bi = ā + b̄i

and b ≠ b̄ there would follow

i = (a − ā)/(b̄ − b)

that is, i would be a real number, which is impossible because the square
of a real number cannot equal −1; but if b = b̄, then a = ā. In
particular, the elements of P written as (2) include all real numbers
(the case b = 0) and also the element i (the case a = 0, b = 1).

We will now show that the collection of all elements of type (2)
constitutes a subfield of the field P. This will then be the desired
field D (i). Suppose we have the elements α = a + bi and
β = c + di. Then, using the commutativity and associativity of
addition and the distributive law, all of which hold in P, we get

α + β = (a + bi) + (c + di) = (a + c) + (bi + di)

whence

α + β = (a + c) + (b + d) i   (3)

Thus, this sum again belongs to the set of elements under
consideration. Furthermore,

−β = (−c) + (−d) i

since, by (3), the equality β + (−β) = 0 + 0i = 0 holds true.
Therefore

α − β = α + (−β) = (a − c) + (b − d) i   (3')

That is to say, this set is also closed under subtraction. Again using
properties I to V, which hold for operations in the field P
(see Sec. 44), and relying on the equality i^2 = −1, we get

αβ = (a + bi) (c + di) = ac + adi + bci + bdi^2

that is,

αβ = (ac − bd) + (ad + bc) i   (4)

Thus the product of any two elements of the type (2) is again an
element of this type. Finally, suppose that β ≠ 0, i.e., at least one of
the numbers c, d is nonzero. Then we will also have c − di ≠ 0 and

(c + di) (c − di) = c^2 − (di)^2 = c^2 − d^2 i^2 = c^2 + d^2

and c^2 + d^2 ≠ 0. Therefore, using the assertion (stated in the
preceding section) that all the ordinary rules of handling fractions
hold true in any field, and thus, in particular, that a fraction remains
unchanged when the numerator and denominator are multiplied
by the same nonzero element, we obtain

α/β = (a + bi)/(c + di) = [(a + bi) (c − di)] / [(c + di) (c − di)]
    = [(ac + bd) + (bc − ad) i] / (c^2 + d^2)

That is to say, the element

α/β = (ac + bd)/(c^2 + d^2) + [(bc − ad)/(c^2 + d^2)] i

again has the form (2).

We will now show that the subfield D (i) which we have derived
from the field P is isomorphic to that field of points in a plane that was
constructed in Sec. 17. Associating with the element a + bi of the
field D (i) a point (a, b), we obtain (due to the uniqueness, just
proved, of the notation (2) for elements of the field D (i)) a one-to-one
correspondence between the elements of this field and all the
points in the plane. In this correspondence, the real number a is
associated with the point (a, 0) because of the equality a = a + 0i,
and the element i = 0 + 1·i is associated with the point (0, 1). On
the other hand, comparing formulas (3) and (4) of this section with
formulas (2) and (3) of Sec. 17, we find that the sum and product of
the elements α and β of the field D (i) are correlated with the points
which are the sum and, respectively, the product of the points
associated with the elements α and β.
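
Formulas (3) and (4) and the expression obtained for the quotient can be tried out directly on ordered pairs. The Python sketch below is only an illustration under our own conventions: a pair (a, b) stands for the element a + bi, the names pair_add, pair_mul and pair_div are hypothetical, and exact rational arithmetic from the standard fractions module replaces real numbers so that the checks involve no rounding.

```python
from fractions import Fraction


def pair_add(p, q):
    """Formula (3): (a, b) + (c, d) = (a + c, b + d)."""
    return (p[0] + q[0], p[1] + q[1])


def pair_mul(p, q):
    """Formula (4): (a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)


def pair_div(p, q):
    """Quotient ((ac + bd)/(c^2 + d^2), (bc - ad)/(c^2 + d^2)) for (c, d) != (0, 0)."""
    a, b = p
    c, d = q
    norm = c * c + d * d
    if norm == 0:
        raise ZeroDivisionError("division by the zero element")
    return (Fraction(a * c + b * d, norm), Fraction(b * c - a * d, norm))


I = (0, 1)  # the pair that plays the role of the element i

if __name__ == "__main__":
    print(pair_mul(I, I))               # the pair standing for -1
    quotient = pair_div((1, 2), (3, 4))
    print(quotient)
    print(pair_mul(quotient, (3, 4)))   # multiplying back recovers (1, 2)
```

Multiplying the computed quotient by the divisor returns the dividend, which is precisely the defining property bq = a of a quotient in a field.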

This completes the proof of the theorem, since all fields that are
isomorphic to some given field are isomorphic among themselves.
In particular, we see that the choice (in Sec. 17) of formulas (2) and
(3) for determining operations involving points was not accidental
and cannot be altered.

There are many other ways of constructing the field of complex
numbers. Let us examine one which uses the addition and
multiplication of matrices.

We consider a noncommutative ring of second-order matrices
over the field of real numbers. It is obvious that the scalar matrices

    ( a  0 )
    ( 0  a )

constitute in this ring a subfield that is isomorphic to the field of
real numbers. It turns out, however, that in the ring of second-order
matrices over the field of reals, we can also find a subfield that is
isomorphic to the field of complex numbers. Indeed, associate with every
complex number a + bi the matrix

    (  a  b )
    ( −b  a )

In this way, the entire field of complex numbers is mapped one-to-one
onto a part of the ring of second-order matrices, and from the
equations

    (  a  b )   (  c  d )   (    a + c      b + d )
    ( −b  a ) + ( −d  c ) = ( −(b + d)      a + c ),

    (  a  b ) (  c  d )   (    ac − bd      ad + bc )
    ( −b  a ) ( −d  c ) = ( −(ad + bc)      ac − bd )

it follows that this mapping is isomorphic, since the matrices in
the right-hand members correspond to the complex numbers
(a + c) + (b + d) i = (a + bi) + (c + di) and (ac − bd) +
+ (ad + bc) i = (a + bi) (c + di). In particular, the role of the
imaginary unit i is played by the matrix

    (  0  1 )
    ( −1  0 )

The foregoing result indicates yet another possible way of con- 
structing the field of complex numbers that is just as satisfactory 
as those considered earlier. 
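
The matrix construction lends itself to a direct check. In the Python sketch below (our own illustration; the names to_matrix, mat_add and mat_mul are hypothetical, and matrices are stored as nested tuples of integers) the matrix associated with a + bi has rows (a, b) and (−b, a), exactly as in the correspondence described above.

```python
def to_matrix(a, b):
    """Matrix with rows (a, b) and (-b, a), associated with the number a + bi."""
    return ((a, b), (-b, a))


def mat_add(m, n):
    """Sum of two second-order matrices."""
    return tuple(
        tuple(m[r][c] + n[r][c] for c in range(2)) for r in range(2)
    )


def mat_mul(m, n):
    """Product of two second-order matrices (rows by columns)."""
    return tuple(
        tuple(sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


I_MATRIX = to_matrix(0, 1)  # plays the role of the imaginary unit

if __name__ == "__main__":
    print(mat_mul(I_MATRIX, I_MATRIX))                 # the matrix associated with -1
    print(mat_mul(to_matrix(1, 2), to_matrix(3, 4)))   # the matrix of (1 + 2i)(3 + 4i)
```

Since (1 + 2i)(3 + 4i) = −5 + 10i, the second product is the matrix associated with −5 + 10i, and the square of I_MATRIX is the scalar matrix associated with −1.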

47. Linear Algebra and the Algebra of Polynomials 

over an Arbitrary Field 

In the earlier chapters of this book devoted to linear algebra, the 
base field was the field of real numbers. It is easy to verify, however, 
that much of what was written in those chapters can be carried over 
word for word to the case of an arbitrary base field. 

Thus, for an arbitrary base field P, the Gaussian method for solving
systems of linear equations, the theory of determinants and Cramer's
rule, which were given in Chapter 1, all hold true. It is only the remark
concerning skew-symmetric determinants (at the end of Sec. 4)
which requires the assumption that the characteristic of the field
P is different from two. Incidentally, the proof of Property 4 (same
section) also breaks down if the characteristic of the field P is equal
to two, though the property itself holds true.

It is also useful to note that the assertion (mentioned repeatedly 
in Chapter 1) on the existence of an infinity of distinct solutions to
an indeterminate system of linear equations holds true in the case
of any infinite base field P, but ceases to hold if P is finite.

The following carry over completely to the case of an arbitrary 
base field: the theory of linear dependence of vectors, the theory of the 
rank of a matrix and the general theory of systems of linear equations 
(see Chapter 2), and also the algebra of matrices (Chapter 3).

The general theory of quadratic forms constructed in Sec. 26 is 
carried over to the case of any base field P whose characteristic is different 
from two. As can be readily demonstrated, the fundamental theorem 
of this section ceases to hold without this restriction. 

For example, let P = Z_2, that is, let P be a field consisting of
two elements 0 and 1; let 1 + 1 = 0, whence −1 = 1, and let there
be a quadratic form f = x_1 x_2 over this field. If there exists a linear
transformation

x_1 = b_11 y_1 + b_12 y_2,
x_2 = b_21 y_1 + b_22 y_2

which reduces f to canonical form, then in the equation

f = (b_11 y_1 + b_12 y_2) (b_21 y_1 + b_22 y_2)
  = b_11 b_21 y_1^2 + (b_11 b_22 + b_12 b_21) y_1 y_2 + b_12 b_22 y_2^2

the coefficient b_11 b_22 + b_12 b_21 of the product y_1 y_2 must be equal to
zero. But this coefficient is equal to the determinant of the linear
transformation that we took, since irrespective of whether b_12 b_21 = 1
or b_12 b_21 = 0, we have b_12 b_21 = −b_12 b_21 in both cases. Our linear
transformation turned out to be singular.
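
Because the field Z_2 has only two elements, the claim can also be confirmed by exhausting all sixteen linear transformations. The following Python sketch is an illustration of ours (the names determinant, cross_coefficient and the three lists are hypothetical); a transformation is stored as the tuple (b_11, b_12, b_21, b_22).

```python
from itertools import product


def determinant(b11, b12, b21, b22):
    """Determinant b11*b22 - b12*b21 computed in Z_2."""
    return (b11 * b22 - b12 * b21) % 2


def cross_coefficient(b11, b12, b21, b22):
    """Coefficient of y1*y2 after substituting into f = x1*x2, computed in Z_2."""
    return (b11 * b22 + b12 * b21) % 2


ALL_TRANSFORMATIONS = list(product((0, 1), repeat=4))
NONSINGULAR = [b for b in ALL_TRANSFORMATIONS if determinant(*b) == 1]
# Nonsingular transformations that would remove the term y1*y2:
DIAGONALIZING = [b for b in NONSINGULAR if cross_coefficient(*b) == 0]

if __name__ == "__main__":
    print(len(ALL_TRANSFORMATIONS), len(NONSINGULAR), DIAGONALIZING)
```

The list DIAGONALIZING comes out empty: over Z_2 the coefficient of y_1 y_2 always coincides with the determinant, so no nonsingular transformation can make it vanish.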

The rest of Chapter 6 is largely devoted to quadratic forms with 
complex or real coefficients. 

Finally, the entire theory of linear spaces and their linear
transformations which was constructed in Chapter 7 holds true for the case
of an arbitrary base field P. Incidentally, the concept of a characteri- 
stic root is connected with the theory of polynomials over an arbi- 
trary field (this will be discussed below). Notice that the theorem, 
in Sec. 33, on the relationship between characteristic roots and 
eigenvalues will now be formulated as follows: the characteristic 
roots of a linear transformation φ which lie in the base field, and
they alone, serve as the eigenvalues of this transformation.

Now the theory of Euclidean spaces (Chapter 8) is essentially 
connected with the field of real numbers. 

We can also extend to the case of an arbitrary base field P certain 
of the above-discussed sections of the algebra of polynomials. Howe- 
ver, it is first necessary to make precise the meaning of the concept 
of a polynomial over an arbitrary field. 

In Sec. 20 we indicated two viewpoints concerning the concept 
of a polynomial: the formal-algebraic view and the function-theore- 
tic view. Both can be transferred to the case of an arbitrary base 
field. However, though they are equivalent in the case of number
fields (see Sec. 24), and, as can readily be verified, of infinite fields
in general, they cease to be equivalent in the case of finite fields.

Consider, for instance, the field Z_2 introduced in Sec. 45 and
consisting of two elements 0 and 1 with 1 + 1 = 0. The polynomials
x + 1 and x^2 + 1 with coefficients from this field are distinct;
that is to say, they do not satisfy the algebraic definition of equality
of polynomials. Yet, for x = 0, both these polynomials become 1,
and for x = 1 they have the value 0, that is to say, they must be
considered equal as “functions” of the “variable” x, which takes on
values in the field Z_2. In the field Z_3, consisting of three elements:
0, 1, 2, with 1 + 2 = 0, the situation is the same relative to the
polynomials x^3 + x + 1 and 2x + 1. Examples of this type can,
generally, be indicated for all finite fields.
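
Both examples can be verified by evaluating the polynomials at every element of the field. In the Python sketch below (an illustration of ours; the names evaluate and same_function are hypothetical) a polynomial is given by the list of its coefficients in ascending powers of x, so that [1, 0, 1] stands for 1 + x^2.

```python
def evaluate(coeffs, x, p):
    """Value in Z_p of the polynomial with the given ascending coefficients."""
    value = 0
    for c in reversed(coeffs):  # Horner's scheme, highest power first
        value = (value * x + c) % p
    return value


def same_function(f, g, p):
    """True if f and g take equal values at every element of Z_p."""
    return all(evaluate(f, x, p) == evaluate(g, x, p) for x in range(p))


if __name__ == "__main__":
    print(same_function([1, 1], [1, 0, 1], 2))     # x + 1 and x^2 + 1 over Z_2
    print(same_function([1, 1, 0, 1], [1, 2], 3))  # x^3 + x + 1 and 2x + 1 over Z_3
```

In both cases the coefficient lists differ, so the polynomials are distinct in the formal-algebraic sense, yet same_function reports that they define one and the same function on the field.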

Thus, in the theory of an arbitrary field P, one cannot accept the 
function-theoretic view of polynomials. It consequently becomes
necessary to make explicit the formal-algebraic definition of a
polynomial. For this purpose, we will construct a ring of polynomials
over an arbitrary field P in a way that dispenses, from the very start,
with the ordinary notation of polynomials in terms of an “unknown” x.

Consider all possible ordered finite systems of elements of the
field P having the form

(a_0, a_1, ..., a_{n−1}, a_n)   (1)

Here, n is arbitrary, n ≥ 0, but for n > 0 it must be true that
a_n ≠ 0. Defining addition and multiplication for systems of the form
(1) in accord with formulas (3) and (4), Sec. 20, we convert the
collection of these systems into a commutative ring; the necessary proofs
of the properties repeat word for word what was accomplished for
number polynomials in Sec. 20.

In the ring we have constructed, systems of the form (a) (the
case n = 0) constitute a subfield isomorphic to the field P. This
permits identifying such systems with corresponding elements a
of the field P, that is, setting

(a) = a for all a in P   (2)

On the other hand, denote the system (0, 1) by the letter x,

x = (0, 1)

Then, applying the above-indicated definition of multiplication, we
find that x^2 = (0, 0, 1) and, generally,

x^k = (0, 0, ..., 0, 1), with k zeros preceding the 1   (3)


Now using the definitions of addition and multiplication of
ordered systems, and also equalities (2) and (3), we get

(a_0, a_1, a_2, ..., a_{n−1}, a_n)
  = (a_0) + (0, a_1) + (0, 0, a_2) + ...
    + (0, 0, ..., 0, a_{n−1}) + (0, 0, ..., 0, a_n)
  = (a_0) + (a_1) (0, 1) + (a_2) (0, 0, 1) + ...
    + (a_{n−1}) (0, 0, ..., 0, 1) + (a_n) (0, 0, ..., 0, 1)
  = a_0 + a_1 x + a_2 x^2 + ... + a_{n−1} x^{n−1} + a_n x^n

where in the last two systems of the middle line the 1 is preceded by
n − 1 and by n zeros respectively.

Thus, any ordered system of type (1) can be written as a
polynomial in x with coefficients from the field P, and this notation will
evidently be unique. Finally, starting with the already proved
commutativity of addition, we can go over to the notation in descending
powers of x.

Consequently, we construct a commutative ring which it is
natural to call a ring of polynomials in the unknown x over the field P.
This ring is symbolized as P [x].
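
The construction by ordered systems translates almost literally into a program. The Python sketch below is our own illustration and makes two assumptions not found in the text: the base field is taken to be a finite field Z_p with p prime, and a system (a_0, a_1, ..., a_n) is stored as a tuple in which trailing zeros are removed so that the last entry is nonzero whenever n > 0. The names normalize, poly_add, poly_mul and X are hypothetical.

```python
def normalize(coeffs, p):
    """Reduce coefficients modulo p and drop trailing zeros, keeping one entry."""
    c = [a % p for a in coeffs]
    while len(c) > 1 and c[-1] == 0:
        c.pop()
    return tuple(c)


def poly_add(f, g, p):
    """Sum of two systems: coefficients in like positions are added."""
    n = max(len(f), len(g))
    padded_f = list(f) + [0] * (n - len(f))
    padded_g = list(g) + [0] * (n - len(g))
    return normalize([a + b for a, b in zip(padded_f, padded_g)], p)


def poly_mul(f, g, p):
    """Product of two systems: entry i + j collects the products f[i] * g[j]."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return normalize(out, p)


X = (0, 1)  # the system denoted by the letter x

if __name__ == "__main__":
    x_squared = poly_mul(X, X, 5)
    print(x_squared)
    # Rebuild (2, 3, 4) as 2 + 3x + 4x^2 over Z_5.
    rebuilt = poly_add(
        poly_add((2,), poly_mul((3,), X, 5), 5),
        poly_mul((4,), x_squared, 5),
        5,
    )
    print(rebuilt)
```

The product of X with itself is (0, 0, 1), and the system (2, 3, 4) is recovered as the combination 2 + 3x + 4x^2, in line with the expansion derived above.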

The ring P [x] contains the field P itself, as was demonstrated
above. Now, as in the case of rings of polynomials over number fields
(see Sec. 20), the ring P [x] has a unit element, does not have zero
divisors and is not a field.

If the field P is contained in a greater field P̄, then the ring P [x]
is a subring of the ring P̄ [x]: any polynomial with coefficients from
P can of course be considered a polynomial over the field P̄ too;
now the sum and product of polynomials depend solely on their
coefficients, and for this reason they do not change when passing
to a larger field.

To get a still better picture of the true extent of the concept
of a “ring of polynomials over a field P”, let us examine it from yet
another angle.

Let the field P be contained as a subring in some commutative
ring L. The element α of the ring L is called algebraic over the field P
if there exists an equation of degree n, n ≥ 1, with coefficients from
the field P that is satisfied by the element α. If there is no such
equation, then the element α is termed transcendental over the field P.
Naturally, the element x of the ring P [x] is transcendental over the
field P.

The following theorem holds true.

If the element $\alpha$ of ring $L$ is transcendental over the field
$P$, then the subring $L'$ obtained by adjoining the element $\alpha$
to the field $P$ (i.e., the minimal subring of the ring $L$ containing
the field $P$ and the element $\alpha$) is isomorphic to the ring
$P[x]$ of polynomials.

Indeed, any element $\beta$ of the ring $L$ which can be written as

$$\beta = a_0 \alpha^n + a_1 \alpha^{n-1} + \ldots + a_{n-1} \alpha + a_n, \quad n \geq 0 \qquad (4)$$

with coefficients $a_0, a_1, \ldots, a_n$ from the field $P$ will be
contained in the subring $L'$. The element $\beta$ cannot have two
distinct notations of the form (4), since by subtracting one from the
other we would find that there exists an equation over the field $P$
satisfied by the element $\alpha$, but this is in conflict with the
transcendental nature of this element. Combining the elements of
type (4) by the rules of addition in the ring $L$, it is of course
possible to combine coefficients of like powers of $\alpha$; but this
coincides with the rule for adding polynomials. On the other hand, by
multiplying elements of form (4) by the rules for multiplication in the
ring $L$, we can, using the distributive law, perform termwise
multiplication and then collect like terms. This evidently leads to the
familiar law of multiplication of polynomials. This proves that
elements of the type (4) constitute, in the ring $L$, a subring
containing the field $P$ and the element $\alpha$ (that is, a subring
coinciding with $L'$), and that this subring is isomorphic to the
polynomial ring $P[x]$.

We see that the choice of definitions for operations on polynomials
we made above was not accidental; it is fully determined by the fact
that the element $x$ of the ring $P[x]$ must be transcendental over
the field $P$.

Note that in constructing the polynomial ring $P[x]$ we never
used the division of elements of the field $P$ and only once (namely,
in proving the assertion on the degree of a product of polynomials)
had to refer to the absence of zero divisors in the field $P$. It is
therefore possible to take an arbitrary commutative ring $L$ and,
repeating the foregoing construction, derive a polynomial ring $L[x]$
over the ring $L$; if in this case the ring $L$ does not contain
divisors of zero, the degree of the product of the polynomials will be
equal to the sum of the degrees of the factors and therefore the
polynomial ring $L[x]$ will not contain divisors of zero either.
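The role of zero divisors is easy to see on a small example. The sketch below is illustrative only: it takes the residue rings $Z/6$ (which has zero divisors, since $2 \cdot 3 = 0$) and $Z/7$ (a field) as the coefficient ring $L$; the helper names `mul_mod` and `degree` are assumptions made for this example.

```python
def mul_mod(p, q, m):
    """Multiply coefficient lists (lowest degree first) with coefficients in Z/m."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) % m
    while out and out[-1] == 0:   # drop leading terms that vanished modulo m
        out.pop()
    return out

def degree(p):
    """Degree of a polynomial; the zero polynomial (empty list) gets -1 here."""
    return len(p) - 1

f = [1, 2]   # 1 + 2x
g = [1, 3]   # 1 + 3x

over_z6 = mul_mod(f, g, 6)   # 1 + 5x, because 6x^2 vanishes in Z/6
over_z7 = mul_mod(f, g, 7)   # 1 + 5x + 6x^2
```

Over $Z/6$ the degree of the product drops below the sum of the degrees, and the product of the nonzero polynomials $2x$ and $3x$ is zero; over $Z/7$ neither effect occurs.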

Returning to polynomials with coefficients from an arbitrary
field $P$, notice that actually the entire theory of divisibility of
polynomials (described in Secs. 20-22 of this book) is carried over
to this case. Namely, in the ring $P[x]$ we have the division
algorithm, and both the quotient and the remainder will themselves
belong to the ring $P[x]$. Also, the concept of a divisor is meaningful
in the ring $P[x]$ and all its basic properties are preserved. The fact
that the division algorithm does not take us outside the base field
$P$ permits us to assert that the property of a polynomial
$\varphi(x)$ to be a divisor of $f(x)$ does not depend upon whether we
consider the field $P$ or any extension of it.

Also preserved in the ring $P[x]$ are the definition and all the
properties of a greatest common divisor, together with the Euclidean
algorithm and the theorem proved in Sec. 21 with the aid of this
algorithm. Notice that since the division algorithm is, as we know,
independent of the choice of the base field, we can assert that the
greatest common divisor of two given polynomials is likewise
independent of whether we consider the field $P$ or an arbitrary
extension $\bar{P}$ of it.
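As an illustration, here is a minimal sketch of the division algorithm and the Euclidean algorithm, assuming that the base field is the field of rationals (modelled by `Fraction`) and that polynomials are coefficient lists with the lowest power first. The function names are hypothetical. Every intermediate result has rational coefficients, which is the reason the greatest common divisor does not change when a larger field is taken.

```python
from fractions import Fraction

def trim(p):
    """Drop zero leading coefficients (lists are ordered lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def divmod_poly(f, g):
    """Division algorithm: return (q, r) with f = g*q + r and deg r < deg g."""
    r = trim(Fraction(c) for c in f)
    g = trim(Fraction(c) for c in g)
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift, c = len(r) - len(g), r[-1] / g[-1]
        q[shift] = c
        for i, b in enumerate(g):
            r[i + shift] -= c * b      # cancel the current leading term of r
        r = trim(r)
    return q, r

def gcd_poly(f, g):
    """Euclidean algorithm; the result is normalised to leading coefficient 1."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, divmod_poly(f, g)[1]
    return [Fraction(c) / f[-1] for c in f]

f = [-4, 0, 0, 0, 1]   # x^4 - 4 = (x^2 - 2)(x^2 + 2)
g = [0, -2, 0, 1]      # x^3 - 2x = x (x^2 - 2)
d = gcd_poly(f, g)     # x^2 - 2
```

Although $x^2 - 2$ splits into linear factors over the real numbers, the algorithm never leaves the rationals and returns the same $d(x)$ whichever field is considered.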

Finally, for polynomials over the field $P$, the concept of a root is
meaningful and the basic properties of roots hold true. Likewise
preserved is the theory of multiple roots. Incidentally, we will return
to this question at the end of the next section.

These remarks will enable us, in our subsequent study of polynomials
over any field $P$, to refer to Secs. 20-22.

48. Factorization of Polynomials into Irreducible Factors

On the basis of the theorem on the existence of a root, we
proved in Sec. 24 the existence and uniqueness of factorization of a
polynomial into irreducible factors for fields of complex and real
numbers. These results are particular cases of general theorems
referring to polynomials over an arbitrary field $P$. The present
section is devoted to this general theory, which parallels the theory
of the prime factorization of integers.

First let us define those polynomials which play the same role
in the polynomial ring as primes play in the ring of integers. We
stress from the start that in this definition we deal solely with
polynomials whose degree is greater than or equal to unity. This is in
full accord with the fact that in the definition of prime numbers and
in the study of the factorization of integers into prime factors, the
numbers 1 and -1 are ruled out.

Suppose we have a polynomial $f(x)$ of degree $n$, $n \geq 1$, with
coefficients from the field $P$. By Property V, Sec. 21, all
polynomials of zero degree are divisors of $f(x)$. On the other hand,
by Property VII, all polynomials $cf(x)$, where $c$ is a nonzero
element of $P$, will also be divisors of $f(x)$; note that these
polynomials exhaust all the divisors (with degree $n$) of the
polynomial $f(x)$. As to divisors (of $f(x)$) whose degree is greater
than 0 but less than $n$, it will be seen that they may or may not be
in the ring $P[x]$. In the former case, the polynomial $f(x)$ is called
reducible in the field $P$ (or over the field $P$), in the latter case,
irreducible over this field.

Recalling the definition of a divisor, we may say that a polynomial
$f(x)$ of degree $n$ is reducible over the field $P$ if it can be
factored over this field (i.e., in the ring $P[x]$) into a product of
two factors of degree less than $n$:

$$f(x) = \varphi(x) \psi(x) \qquad (1)$$

and $f(x)$ is irreducible over the field $P$ if in any factorization of
the type (1), one of its factors is of degree 0 and the other is of
degree $n$.

Note particularly that one can speak of reducibility or
irreducibility of a polynomial only as regards a given field $P$, since
a polynomial that is irreducible over one field may prove to be
reducible over some extension $\bar{P}$ of that field. Thus, the
polynomial $x^2 - 2$ with integral coefficients is irreducible over the
field of rational numbers: it cannot be factored into a product of two
linear factors with rational coefficients. However, this polynomial is
reducible over the field of real numbers, as the following equation
shows:

$$x^2 - 2 = (x - \sqrt{2})(x + \sqrt{2})$$

The polynomial $x^2 + 1$ is irreducible not only over the field of
rational numbers but also over the field of real numbers. It becomes
reducible, however, in the field of complex numbers, since

$$x^2 + 1 = (x - i)(x + i)$$

Let us point to certain basic properties of irreducible polynomials,
bearing in mind that we will be speaking of polynomials irreducible
over the field $P$.

(a) Any polynomial of degree one is irreducible.

This is rather evident since if the polynomial could be factored
into a product of factors of lower degree, then they would have to
be of degree 0. But the product of any polynomials of zero degree
is again a polynomial of zero degree and not first degree.

(b) If a polynomial $p(x)$ is irreducible, then any polynomial
$cp(x)$, where $c$ is a nonzero element of $P$, is also irreducible.

This property follows from Properties I and VII of Sec. 21. It
will permit us, where necessary, to confine our consideration to
irreducible polynomials whose leading coefficients are unity.

(c) If $f(x)$ is an arbitrary polynomial and $p(x)$ is an irreducible
polynomial, then either $f(x)$ is divisible by $p(x)$ or the
polynomials are coprime (relatively prime).

If $(f(x), p(x)) = d(x)$, then $d(x)$, being a divisor of the
irreducible polynomial $p(x)$, is either of degree 0 or is a polynomial
of the form $cp(x)$, $c \neq 0$. In the former case, $f(x)$ and $p(x)$
are coprime, in the latter, $f(x)$ is divisible by $p(x)$.

(d) If the product of the polynomials $f(x)$ and $g(x)$ is divisible by
an irreducible polynomial $p(x)$, then at least one of these
polynomials is divisible by $p(x)$.

Indeed, if $f(x)$ is not divisible by $p(x)$, then, by (c), $f(x)$ and
$p(x)$ are coprime, and then, by Property (b) of Sec. 21, the
polynomial $g(x)$ must be divisible by $p(x)$.

Property (d) is readily carried over to the case of a product of any
finite number of factors.

The two theorems which follow are the main purpose of this
whole section.

Any polynomial $f(x)$ in the ring $P[x]$ having degree $n$,
$n \geq 1$, can be factored into a product of irreducible factors.

Indeed, if a polynomial $f(x)$ is itself irreducible, then the
indicated product consists of only one polynomial. But if it is
reducible, then it can be factored into a product of factors of lower
degree. If, among these factors, we again find reducible ones, then we
decompose them into factors again, etc. This process will cease after a
finite number of steps, since in any factorization of $f(x)$ into
factors, the sum of the degrees of the factors must be equal to $n$ and
therefore the number of factors dependent on $x$ cannot exceed $n$.

The factorization of integers into prime factors is unique if we
confine our consideration to positive integers. However, in the
ring of all integers, uniqueness only occurs to within sign: thus,
$-6 = 2 \cdot (-3) = (-2) \cdot 3$, $10 = 2 \cdot 5 = (-2) \cdot (-5)$
and so on.

A similar situation obtains in the polynomial ring as well. If

$$f(x) = p_1(x) p_2(x) \ldots p_s(x)$$

is a factorization of the polynomial $f(x)$ into a product of
irreducible factors and if the elements $c_1, c_2, \ldots, c_s$ from
the field $P$ are such that their product is equal to 1, then

$$f(x) = [c_1 p_1(x)] \cdot [c_2 p_2(x)] \ldots [c_s p_s(x)]$$

will also, by (b), be a factorization of $f(x)$ into a product of
irreducible factors. It turns out that this exhausts all factorizations
of the polynomial $f(x)$.

If a polynomial $f(x)$ from a ring $P[x]$ can be decomposed in two
ways into a product of irreducible factors:

$$f(x) = p_1(x) p_2(x) \ldots p_s(x) = q_1(x) q_2(x) \ldots q_t(x) \qquad (2)$$

then $s = t$, and, with appropriate numbering, we have the equalities

$$q_i(x) = c_i p_i(x), \quad i = 1, 2, \ldots, s \qquad (3)$$

where the $c_i$ are nonzero elements from the field $P$.

This theorem holds for polynomials of degree one, since they
are irreducible. We will therefore argue by induction with respect
to the degree of the polynomial, that is, we will prove the theorem
for $f(x)$, assuming that for polynomials of lower degree it is already
proved.

Since $q_1(x)$ is a divisor of $f(x)$, it follows, by Property (d) and
equality (2), that $q_1(x)$ will be a divisor of at least one of the
polynomials $p_i(x)$, say of $p_1(x)$. However, since the polynomial
$p_1(x)$ is irreducible and the degree of $q_1(x)$ is greater than
zero, there exists an element $c_1$ such that

$$q_1(x) = c_1 p_1(x) \qquad (4)$$

Substituting this expression of $q_1(x)$ into (2) and cancelling
$p_1(x)$ (which is permissible since there are no zero divisors in the
ring $P[x]$), we obtain the equation

$$p_2(x) p_3(x) \ldots p_s(x) = [c_1 q_2(x)] q_3(x) \ldots q_t(x)$$

Since the degree of the polynomial equal to these products is lower
than that of $f(x)$, it is already proved that $s - 1 = t - 1$,
whence $s = t$, and there exist elements $c_2', c_3, \ldots, c_s$ such
that $c_2' p_2(x) = c_1 q_2(x)$, whence
$q_2(x) = (c_1^{-1} c_2') p_2(x)$, and $c_i p_i(x) = q_i(x)$,
$i = 3, \ldots, s$. Setting $c_1^{-1} c_2' = c_2$ and taking into
account (4), we get the equations (3) completely.

The theorem we have just proved may be stated more succinctly:
every polynomial may be uniquely decomposed into irreducible factors
to within zero-degree factors.

Incidentally, it is always possible to consider the following
special type of factorization which will be unique for every
polynomial: take any factorization of the polynomial $f(x)$ into
irreducible factors and factor out of each the leading coefficient. We
get the factorization

$$f(x) = a_0 p_1(x) p_2(x) \ldots p_s(x) \qquad (5)$$

where all the $p_i(x)$, $i = 1, 2, \ldots, s$, are irreducible
polynomials with leading coefficients equal to unity. The factor $a_0$
will be equal to the leading coefficient of the polynomial $f(x)$, as
can readily be verified by multiplying out the right member of (5).

The irreducible factors in (5) do not necessarily have to be
distinct. If an irreducible polynomial $p(x)$ appears several times in
the factorization (5), it is called a multiple factor of $f(x)$, namely
a $k$-fold (double, triple, etc.) factor if (5) contains exactly $k$
factors equal to $p(x)$. But if the factor $p(x)$ appears in (5) only
once, then it is called a simple (or single) factor of $f(x)$.

If in the factorization (5) the factors $p_1(x), p_2(x), \ldots,
p_l(x)$ are distinct and any other factor is equal to one of them and
if $p_i(x)$, $i = 1, 2, \ldots, l$, is a $k_i$-fold factor of the
polynomial $f(x)$, then (5) may be rewritten as

$$f(x) = a_0 p_1^{k_1}(x) p_2^{k_2}(x) \ldots p_l^{k_l}(x) \qquad (6)$$

This is the notation that we will ordinarily make use of without
specifying that the exponents are equal to the multiplicities of the
corresponding factors, i.e., that $p_i(x) \neq p_j(x)$ for $i \neq j$.

If we are given the factorizations of the polynomials $f(x)$ and
$g(x)$ into irreducible factors, then the greatest common divisor
$d(x)$ of these polynomials is equal to the product of the factors
appearing in both factorizations at the same time, and each factor is
taken to the power equal to the least of its multiplicities in the two
given polynomials.

Indeed, the indicated product will be a divisor of each of the
polynomials $f(x)$, $g(x)$ and therefore also of $d(x)$. If this
product were different from $d(x)$, then the factorization of $d(x)$
into irreducible factors would either contain a factor that does not
appear in the factorization of at least one of the polynomials $f(x)$
and $g(x)$, which is impossible, or one of the factors would have a
higher power than it has in the factorization of one of the polynomials
$f(x)$ and $g(x)$, which is again impossible.
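The rule can be written down almost verbatim. In the sketch below, which is an illustration rather than a practical method, a factorization is assumed to be given as a mapping from a label of each monic irreducible factor to its multiplicity; the labels are plain strings and the function name is hypothetical.

```python
from collections import Counter

def gcd_from_factorizations(f_factors, g_factors):
    """Each argument maps a monic irreducible factor (a label) to its multiplicity."""
    common = Counter(f_factors) & Counter(g_factors)   # '&' keeps the least multiplicity
    return dict(common)

f_factors = {"x - 1": 3, "x + 2": 1}       # f(x) = (x - 1)^3 (x + 2)
g_factors = {"x - 1": 2, "x^2 + 1": 1}     # g(x) = (x - 1)^2 (x^2 + 1)
d_factors = gcd_from_factorizations(f_factors, g_factors)   # d(x) = (x - 1)^2
```

Factors that occur in only one of the two polynomials drop out, because their least multiplicity is zero.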

This theorem is similar to the rule ordinarily used to find the
greatest common divisor of integers. However, in the case of
polynomials, it cannot replace the Euclidean algorithm. Since there
is only a finite number of primes less than a given positive integer,
the factorization of an integer into prime factors is attained by
a finite number of trials. This is not the case in a polynomial ring
over an infinite base field, and, in the general case, one cannot
offer a method for factoring polynomials into irreducible factors.
What is more, it is very hard even to decide in the general case the
question of whether a polynomial $f(x)$ is irreducible over a given
field $P$. Thus, the description of all irreducible polynomials for the
case of the fields of complex and real numbers was obtained in Sec. 24
as a corollary to a very profound theorem on the existence of a root.
As to the field of rational numbers, only a few assertions of a
specific nature concerning polynomials that are irreducible over this
field will be made in Sec. 56.

We have shown that in the polynomial ring (as in the ring of
integers) we have a factorization into "prime" (irreducible) factors
and that this factorization is in a certain sense unique. The question
arises as to whether it is possible to carry over these results to
broader classes of rings. We confine ourselves here to the case of such
commutative rings as have a unit element and do not have divisors of
zero.

We will use the term divisor of unity for an element $a$ of the ring
such that in this ring there exists an inverse element $a^{-1}$:

$$a a^{-1} = 1$$

In the ring of integers, these are the numbers 1 and -1; in the ring
$P[x]$ of polynomials, all the polynomials of zero degree (that is,
nonzero numbers from the field $P$). The element $c$, which is nonzero
and is not a divisor of unity, will be called a prime element of the
ring if in any decomposition of it into a product of two factors,
$c = ab$, one of the factors is invariably a divisor of unity. In the
ring of integers, the prime elements are prime numbers, in the
polynomial ring they are irreducible polynomials.

Will every element of the ring under consideration that is nonzero
and is not a divisor of unity be decomposable into a product
of prime factors? If it is, will the factorization be unique? This is
to be understood as follows: if

$$a = p_1 p_2 \ldots p_k = q_1 q_2 \ldots q_l$$

are two factorizations of the element $a$ into prime factors, then
$k = l$ and (possibly after a change in the numbering)

$$q_i = p_i c_i, \quad i = 1, 2, \ldots, k$$

where $c_i$ is a divisor of unity.

It turns out that in both instances the answer is no. We give
one example, namely, we indicate a ring in which factorization
into prime factors is possible but not unique.

Consider complex numbers of the form

$$\alpha = a + b\sqrt{-3} \qquad (7)$$

where $a$ and $b$ are integers. All such numbers form a ring without
divisors of zero and containing a unit element; indeed,

$$(a + b\sqrt{-3})(c + d\sqrt{-3}) = (ac - 3bd) + (bc + ad)\sqrt{-3} \qquad (8)$$

We use the term norm of a number $\alpha = a + b\sqrt{-3}$ for the
integer

$$N(\alpha) = a^2 + 3b^2$$

which is positive for every $\alpha \neq 0$. By (8), the norm of a
product is equal to the product of the norms:

$$N(\alpha\beta) = N(\alpha) N(\beta) \qquad (9)$$

Indeed,

$$(ac - 3bd)^2 + 3(bc + ad)^2 = a^2c^2 + 9b^2d^2 + 3b^2c^2 + 3a^2d^2 = (a^2 + 3b^2)(c^2 + 3d^2)$$

If in our ring the number $\alpha$ is a divisor of unity, that is, the
number $\alpha^{-1}$ is also of the form (7), then, by (9),

$$N(\alpha) \cdot N(\alpha^{-1}) = N(\alpha\alpha^{-1}) = N(1) = 1$$

and therefore $N(\alpha) = 1$, since the numbers $N(\alpha)$ and
$N(\alpha^{-1})$ are integers and are positive. If
$\alpha = a + b\sqrt{-3}$, then from $N(\alpha) = 1$ it follows that

$$a^2 + 3b^2 = 1$$

which, however, is possible only when $b = 0$, $a = \pm 1$. Thus, in
our ring, as in the ring of integers, only the numbers 1 and -1 will
be divisors of unity, and only these numbers have a norm equal to
unity.

The equation (9) for the norm of a product can naturally be
extended to the case of any finite number of factors. It is thus easy
to conclude that any number $\alpha$ in our ring can be factored into a
product of a finite number of prime factors. We leave the proof to the
reader.

However, we cannot assert that the factorization into prime factors
is unique. For example, the following equations hold true:

$$4 = 2 \cdot 2 = (1 + \sqrt{-3})(1 - \sqrt{-3})$$

In our ring there are no other divisors of unity except 1 and -1,
and so the number $1 + \sqrt{-3}$ (like the number $1 - \sqrt{-3}$)
cannot differ from the number 2 solely by a factor which is a divisor
of unity. It remains to show that each one of the numbers 2,
$1 + \sqrt{-3}$, $1 - \sqrt{-3}$ will be prime in the ring under
consideration. Indeed, the norm of each of these three numbers is equal
to 4. Let $\alpha$ be any one of these numbers and let

$$\alpha = \beta\gamma$$

Then, by (9), one of the following three cases is possible:
(1) $N(\beta) = 4$, $N(\gamma) = 1$; (2) $N(\beta) = 1$,
$N(\gamma) = 4$; (3) $N(\beta) = N(\gamma) = 2$. In the first case, the
number $\gamma$ will, as we know, be a divisor of unity; in the second
case, $\beta$ will be a divisor of unity. The third case is impossible
due to the impossibility of the equality

$$a^2 + 3b^2 = 2$$

where $a$ and $b$ are integers.
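These computations are easy to check mechanically. The sketch below assumes that a number $a + b\sqrt{-3}$ is stored as the integer pair `(a, b)`; the function names are illustrative.

```python
def mul(u, v):
    """(a + b*sqrt(-3)) * (c + d*sqrt(-3)), following formula (8)."""
    a, b = u
    c, d = v
    return (a * c - 3 * b * d, b * c + a * d)

def norm(u):
    """N(a + b*sqrt(-3)) = a^2 + 3*b^2."""
    a, b = u
    return a * a + 3 * b * b

two, plus, minus = (2, 0), (1, 1), (1, -1)   # 2, 1 + sqrt(-3), 1 - sqrt(-3)

# Two factorizations of 4 into prime elements of the ring.
first = mul(two, two)
second = mul(plus, minus)

# Norm 2 would need a^2 + 3*b^2 = 2; a small search range suffices,
# since |a| <= 1 and b = 0 are forced by the size of the left member.
norm_two = [(a, b) for a in range(-2, 3) for b in range(-2, 3) if norm((a, b)) == 2]
```

Both products equal `(4, 0)`, each of the three factors has norm 4, and `norm_two` is empty, which is exactly the impossibility of case (3).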

Multiple factors. Although, as has been noted above, we have no
general method for decomposing polynomials into irreducible factors,
there exist methods which enable us to determine whether a given
polynomial has multiple factors or not and, if it does, to reduce
the study of that polynomial to the study of polynomials that do not
contain multiple factors. True, these methods require that we impose
certain restrictions on the base field. In the rest of this section we
will assume that the field $P$ has characteristic 0. Without this
restriction, the theorems on multiple factors that will be proved
below break down. At the same time, the case of fields of
characteristic zero is the most important one from the viewpoint of
applications since, for one thing, all number fields are included here.

To begin with, notice that we can extend to this case both the
concept of a derivative of a polynomial (introduced in Sec. 22 for
polynomials with complex coefficients) and the basic properties
of this concept.* Let us now prove the following theorem.

If $p(x)$ is a $k$-fold irreducible factor of the polynomial $f(x)$,
$k \geq 1$, then it will be a $(k - 1)$-fold factor of the derivative
of this polynomial. In particular, a simple factor of the polynomial
does not enter into the factorization of the derivative.

Indeed, let

$$f(x) = p^k(x) g(x) \qquad (10)$$

where $g(x)$ is no longer divisible by $p(x)$. Differentiating (10),
we get

$$f'(x) = p^k(x) g'(x) + k p^{k-1}(x) p'(x) g(x) = p^{k-1}(x) [p(x) g'(x) + k p'(x) g(x)]$$

The second term in the brackets is not divisible by $p(x)$; indeed,
$g(x)$ is not divisible by $p(x)$ by hypothesis, and $p'(x)$ is nonzero
(the characteristic of $P$ is 0) and of lower degree than $p(x)$, i.e.,
it is not divisible by $p(x)$ either; hence, due to the irreducibility
of the polynomial $p(x)$ and Property (d) of this section and
Property IX of Sec. 21, our assertion follows. On the other hand,
the first term in the sum in the square brackets is divisible by $p(x)$
and so the entire sum cannot be divisible by $p(x)$; which is to say
that the factor $p(x)$ does indeed appear in $f'(x)$ with a
multiplicity of $k - 1$.


* For fields of a finite characteristic, the assertion that the
derivative of a polynomial of degree $n$ is of degree $n - 1$ fails.

From our theorem and from the above-indicated method of finding
the greatest common divisor of two polynomials it follows that
if a factorization of the polynomial $f(x)$ into irreducible factors is
given,

$$f(x) = a_0 p_1^{k_1}(x) p_2^{k_2}(x) \ldots p_l^{k_l}(x) \qquad (11)$$

then the greatest common divisor of $f(x)$ and of its derivative has the
following factorization into irreducible factors:

$$(f(x), f'(x)) = p_1^{k_1 - 1}(x) p_2^{k_2 - 1}(x) \ldots p_l^{k_l - 1}(x) \qquad (12)$$

where the factor $p_i^{k_i - 1}(x)$ should naturally be replaced by
unity for $k_i = 1$. In particular, a polynomial $f(x)$ does not
contain multiple factors if and only if it is relatively prime to its
derivative.

We now know how to answer the question of the existence of
multiple factors in a given polynomial. What is more, since neither
the derivative of a polynomial nor the greatest common divisor of
two polynomials depends on whether we are considering the field
$P$ or any extension $\bar{P}$ of it, we obtain the following corollary
to the result that has just been proved.

If a polynomial $f(x)$ with coefficients in a field $P$ of
characteristic zero does not have multiple factors over this field,
then neither will there be any multiple factors over any extension
$\bar{P}$ of the field $P$.

In particular, if $f(x)$ is irreducible over $P$ and $\bar{P}$ is some
extension of $P$, then, although $f(x)$ can be reducible over
$\bar{P}$, it will definitely not be divisible by the square of an
irreducible (over $\bar{P}$) polynomial.

Isolating multiple factors. If we have a polynomial $f(x)$ with
the factorization (11) and if by $d_1(x)$ we denote the greatest
common divisor of $f(x)$ and of its derivative $f'(x)$, then (12) will
be a factorization of $d_1(x)$. Dividing (11) by (12), we get

$$v_1(x) = \frac{f(x)}{d_1(x)} = a_0 p_1(x) p_2(x) \ldots p_l(x)$$

that is, we obtain a polynomial without multiple factors, and any
irreducible factor of $v_1(x)$ will also be a factor of $f(x)$. In this
way, finding the irreducible factors of $f(x)$ is reduced to finding
them for the polynomial $v_1(x)$ which, generally speaking, is of lower
degree and, at any rate, contains only simple factors. If the problem
is solved for $v_1(x)$, then it only remains to determine the
multiplicity of the irreducible factors found in $f(x)$; this is done
by means of the division algorithm.

A more sophisticated variant of this method enables us to consider
several polynomials without multiple factors; also, having
found the irreducible factors of these polynomials, we not only
find all the irreducible factors of $f(x)$, but also their
multiplicities.

Let (11) be a factorization of $f(x)$ into irreducible factors, the
greatest multiplicity of the factors being $s$, $s \geq 1$. Denote by
$F_1(x)$ the product of all single factors of $f(x)$, by $F_2(x)$ the
product of all double factors, but taken only once at a time, and so
forth; finally, denote by $F_s(x)$ the product of all $s$-fold factors
taken once at a time, as before. If under these conditions, for some
$j$, there are no $j$-fold factors in $f(x)$, set $F_j(x) = 1$. Then
$f(x)$ will be divisible by the $k$th power of the polynomial $F_k(x)$,
$k = 1, 2, \ldots, s$, and the factorization (11) becomes

$$f(x) = a_0 F_1(x) F_2^2(x) F_3^3(x) \ldots F_s^s(x)$$

and the factorization (12) for $d_1(x) = (f(x), f'(x))$ will be
rewritten as

$$d_1(x) = F_2(x) F_3^2(x) \ldots F_s^{s-1}(x)$$

Denoting by $d_2(x)$ the greatest common divisor of the polynomial
$d_1(x)$ and of its derivative, and generally by $d_k(x)$ the greatest
common divisor of the polynomials $d_{k-1}(x)$ and $d_{k-1}'(x)$, we
obtain in the same fashion

$$\begin{aligned}
d_2(x) &= F_3(x) F_4^2(x) \ldots F_s^{s-2}(x), \\
d_3(x) &= F_4(x) F_5^2(x) \ldots F_s^{s-3}(x), \\
&\ldots \\
d_{s-1}(x) &= F_s(x), \\
d_s(x) &= 1
\end{aligned}$$

Whence

$$\begin{aligned}
v_1(x) &= \frac{f(x)}{d_1(x)} = a_0 F_1(x) F_2(x) F_3(x) \ldots F_s(x), \\
v_2(x) &= \frac{d_1(x)}{d_2(x)} = F_2(x) F_3(x) \ldots F_s(x), \\
v_3(x) &= \frac{d_2(x)}{d_3(x)} = F_3(x) \ldots F_s(x), \\
&\ldots \\
v_s(x) &= \frac{d_{s-1}(x)}{d_s(x)} = F_s(x)
\end{aligned}$$

and, therefore, finally,

$$F_1(x) = \frac{v_1(x)}{a_0 v_2(x)}, \quad F_2(x) = \frac{v_2(x)}{v_3(x)}, \quad \ldots, \quad F_s(x) = v_s(x)$$

Thus, using only procedures that do not require a knowledge
of the irreducible factors of the polynomial $f(x)$, namely, taking
the derivative, using the Euclidean algorithm and the division
algorithm, we can find the polynomials $F_1(x), F_2(x), \ldots, F_s(x)$
without multiple factors; every irreducible factor of the polynomial
$F_k(x)$, $k = 1, 2, \ldots, s$, will be $k$-fold for $f(x)$.

This method cannot, of course, be regarded as a procedure for
factoring a polynomial into irreducible factors, since for the case
of $s = 1$ (that is, for a polynomial without multiple factors) we
only get $f(x) = v_1(x)$.
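The procedure of isolating multiple factors can be sketched as follows. The code is an illustration under assumptions: the base field is the field of rationals (characteristic 0), polynomials are coefficient lists with the lowest power first, $f(x)$ is first divided by its leading coefficient $a_0$ so that every $F_k(x)$ comes out with leading coefficient 1, and all function names are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop zero leading coefficients (lists are ordered lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def divmod_poly(f, g):
    """Division algorithm: return (q, r) with f = g*q + r and deg r < deg g."""
    r = trim(Fraction(c) for c in f)
    g = trim(Fraction(c) for c in g)
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift, c = len(r) - len(g), r[-1] / g[-1]
        q[shift] = c
        for i, b in enumerate(g):
            r[i + shift] -= c * b
        r = trim(r)
    return q, r

def gcd_poly(f, g):
    """Euclidean algorithm; the result is normalised to leading coefficient 1."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, divmod_poly(f, g)[1]
    return [Fraction(c) / f[-1] for c in f]

def derivative(f):
    """Formal derivative of a coefficient list."""
    return trim([i * c for i, c in enumerate(f)][1:])

def isolate_multiple_factors(f):
    """Return [F_1, ..., F_s]; F_k is the product of the k-fold factors, each taken once."""
    f = trim(Fraction(c) for c in f)
    d = [[c / f[-1] for c in f]]                  # d_0 = f / a_0
    while len(d[-1]) > 1:                         # continue until d_s = 1
        d.append(gcd_poly(d[-1], derivative(d[-1])))
    v = [divmod_poly(d[k], d[k + 1])[0] for k in range(len(d) - 1)]   # v_{k+1} = d_k / d_{k+1}
    v.append([Fraction(1)])
    return [divmod_poly(v[k], v[k + 1])[0] for k in range(len(v) - 1)]

# f(x) = x^3 - 3x + 2 = (x - 1)^2 (x + 2)
F = isolate_multiple_factors([2, -3, 0, 1])
```

For this $f(x)$ the result is $F_1(x) = x + 2$ and $F_2(x) = x - 1$. For a polynomial without multiple factors, such as $x^2 - 2$, the list contains only $F_1(x) = f(x)/a_0$, in agreement with the remark above.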

49. Theorem on the Existence of a Root 

Quite naturally, the fundamental theorem (proved in Sec. 23) 
on the existence, for every numerical polynomial, of a root in the 
field of complex numbers cannot be extended to the case of an arbi- 
trary field. In this section we will prove a theorem which in the 
general theory of fields replaces to some extent the afore-mentioned 
fundamental theorem of the algebra of complex numbers. 

Let there be given a polynomial $f(x)$ over a field $P$. A natural
question arises: if the polynomial $f(x)$ does not have any roots at
all in the field $P$, then does there exist an extension $\bar{P}$ of
$P$ in which there will be at least one root of $f(x)$? We can assume
that the degree of the polynomial $f(x)$ is greater than unity: the
question is meaningless for a zero-degree polynomial, and every
polynomial of degree one, $ax + b$, has the root $-\frac{b}{a}$ in the
field $P$ itself. On the other hand, we can evidently confine ourselves
to the case of $f(x)$ being irreducible: if it is reducible over $P$,
then the root of any one of its irreducible factors will be a root of
$f(x)$ itself.

The answer to the question that interests us is given by the
following theorem on the existence of a root.

For every polynomial $f(x)$ that is irreducible over the field $P$
there is an extension of the field that contains a root of $f(x)$. All
minimal fields containing the field $P$ and a root of this polynomial
are isomorphic among themselves.

Let us first prove the second part of the theorem.

Suppose we have a polynomial irreducible over $P$:

$$f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_{n-1} x + a_n \qquad (1)$$

and $n \geq 2$, that is, $f(x)$ has no roots in the field $P$ itself.
Suppose that there is an extension $\bar{P}$ of $P$ which contains a
root $\alpha$ of $f(x)$. Let us prove the following lemma which will be
needed later on but which is of interest in itself.

If a root $\alpha$, in $\bar{P}$, of a polynomial $f(x)$ which is
irreducible over $P$ serves also as a root of some polynomial $g(x)$ in
the ring $P[x]$, then $f(x)$ will be a divisor of $g(x)$.

Indeed, the polynomials $f(x)$ and $g(x)$ over the field $\bar{P}$ have
a common divisor $x - \alpha$ and so are not relatively prime. The
property of polynomials not to be relatively prime does not, however,
depend on the choice of the field. It is therefore possible to pass
to the field $P$ and apply Property (c) of Sec. 48.

Now let us find the minimal subfield $P(\alpha)$ of $\bar{P}$ which
contains the field $P$ and the element $\alpha$. It definitely includes
all elements of the form

$$\beta = b_0 + b_1 \alpha + b_2 \alpha^2 + \ldots + b_{n-1} \alpha^{n-1} \qquad (2)$$

where $b_0, b_1, b_2, \ldots, b_{n-1}$ are elements of $P$. No element
of $\bar{P}$ can have two distinct notations of the form (2); if it is
also true that

$$\beta = c_0 + c_1 \alpha + c_2 \alpha^2 + \ldots + c_{n-1} \alpha^{n-1}$$

and for at least one $k$, $b_k \neq c_k$, then $\alpha$ will be a root
of the polynomial

$$g(x) = (b_0 - c_0) + (b_1 - c_1) x + (b_2 - c_2) x^2 + \ldots + (b_{n-1} - c_{n-1}) x^{n-1}$$

which runs counter to the lemma proved above since the degree
of $g(x)$ is lower than the degree of $f(x)$.

The elements of the field $\bar{P}$ having the form (2) include all the
elements of the field $P$ (for $b_1 = b_2 = \ldots = b_{n-1} = 0$), and
also the element $\alpha$ itself (for $b_1 = 1$,
$b_0 = b_2 = \ldots = b_{n-1} = 0$). We now prove that elements of the
form (2) constitute the entire sought-for subfield $P(\alpha)$. Indeed,
if we are given elements $\beta$ [with notation (2)] and

$$\gamma = c_0 + c_1 \alpha + c_2 \alpha^2 + \ldots + c_{n-1} \alpha^{n-1}$$

then, on the basis of the properties of operations in the field
$\bar{P}$,

$$\beta \pm \gamma = (b_0 \pm c_0) + (b_1 \pm c_1) \alpha + (b_2 \pm c_2) \alpha^2 + \ldots + (b_{n-1} \pm c_{n-1}) \alpha^{n-1}$$

That is to say, the sum and difference of any two elements of the
type (2) are again elements of that type.

If we multiply $\beta$ and $\gamma$, we get an expression containing
$\alpha^n$ and other higher powers of $\alpha$. However, it follows
from (1) and the equality $f(\alpha) = 0$ that $\alpha^n$ and therefore
$\alpha^{n+1}$, $\alpha^{n+2}$ and so on can be expressed in terms of
lower powers of the element $\alpha$. The simplest way of finding an
expression for $\beta\gamma$ is this: let

$$\varphi(x) = b_0 + b_1 x + \ldots + b_{n-1} x^{n-1},$$
$$\psi(x) = c_0 + c_1 x + \ldots + c_{n-1} x^{n-1}$$

whence $\varphi(\alpha) = \beta$, $\psi(\alpha) = \gamma$. Multiply the
polynomials $\varphi(x)$ and $\psi(x)$ and divide the product by
$f(x)$. This yields

$$\varphi(x) \psi(x) = f(x) q(x) + r(x) \qquad (3)$$

where

$$r(x) = d_0 + d_1 x + \ldots + d_{n-1} x^{n-1}$$

Taking the values of both sides of (3) for $x = \alpha$ we find that

$$\varphi(\alpha) \psi(\alpha) = f(\alpha) q(\alpha) + r(\alpha)$$

That is to say, by $f(\alpha) = 0$,

$$\beta\gamma = d_0 + d_1 \alpha + \ldots + d_{n-1} \alpha^{n-1}$$

Thus, the product of two elements of the type (2) will again be an
element of this type.

Finally, we will show that if element $\beta$ is of the type (2),
$\beta \neq 0$, then the element $\beta^{-1}$ existing in the field
$\bar{P}$ can also be written as (2). To do this, take the polynomial

$$\varphi(x) = b_0 + b_1 x + \ldots + b_{n-1} x^{n-1}$$

in the ring $P[x]$. Since the degree of $\varphi(x)$ is lower than the
degree of $f(x)$, and the polynomial $f(x)$ is irreducible over $P$, it
follows that $\varphi(x)$ and $f(x)$ are relatively prime and
therefore, by Secs. 21 and 47, there exist in the ring $P[x]$
polynomials $u(x)$ and $v(x)$ such that

$$\varphi(x) u(x) + f(x) v(x) = 1$$

We can assume here that the degree of $u(x)$ is less than $n$:

$$u(x) = s_0 + s_1 x + \ldots + s_{n-1} x^{n-1}$$

Whence, by $f(\alpha) = 0$, it follows that

$$\varphi(\alpha) u(\alpha) = 1$$

and therefore, by the equality $\varphi(\alpha) = \beta$, we have

$$\beta^{-1} = u(\alpha) = s_0 + s_1 \alpha + \ldots + s_{n-1} \alpha^{n-1}$$

Thus, the collection of elements of the field $\bar{P}$ having the form (2) constitutes a subfield of $\bar{P}$, which is the desired field $P(\alpha)$. Furthermore, since we saw that in seeking the sum and product of the elements $\beta$ and $\gamma$ of the type (2) we need only know the coefficients of the expressions of these elements in terms of powers of $\alpha$, we can assert the truth of the following result. If besides $\bar{P}$ there is another extension $\bar{P}'$ of the field $P$, which also contains a root $\alpha'$ of the polynomial $f(x)$, and if $P(\alpha')$ is a minimal subfield of the field $\bar{P}'$ containing $P$ and $\alpha'$, then the fields $P(\alpha)$ and $P(\alpha')$ are isomorphic. To obtain the isomorphic correspondence between them, it is necessary to associate with the element $\beta$ of type (2) in $P(\alpha)$ an element

$$\beta' = b_0 + b_1\alpha' + b_2\alpha'^2 + \dots + b_{n-1}\alpha'^{n-1}$$

in $P(\alpha')$ having the same coefficients. This completes the proof of the second part of the theorem.

Let us now prove the basic first part of this theorem. The foregoing will help to point the way. We have a polynomial $f(x)$ of degree $n \geq 2$ that is irreducible over the field $P$ and it is required to construct an extension of $P$ containing a root of $f(x)$. To do this, let us take the entire polynomial ring $P[x]$ and partition it into disjoint classes, combining in one class the polynomials which yield the same remainders upon division by the given polynomial $f(x)$. In other words, the polynomials $\varphi(x)$ and $\psi(x)$ belong to the same class if their difference is exactly divisible by $f(x)$.

We agree to denote the resulting classes by the letters $A$, $B$, $C$ and so on and to define the sum and product of classes in the following natural manner. Take any two classes $A$ and $B$; choose in $A$ a polynomial $\varphi_1(x)$, in $B$ a polynomial $\psi_1(x)$ and denote by $\chi_1(x)$ the sum of these polynomials:

$$\chi_1(x) = \varphi_1(x) + \psi_1(x)$$

and by $\theta_1(x)$ their product:

$$\theta_1(x) = \varphi_1(x)\psi_1(x)$$

Now choose any other polynomial $\varphi_2(x)$ in $A$ and any polynomial $\psi_2(x)$ in $B$ and denote by $\chi_2(x)$ and $\theta_2(x)$ their sum and product, respectively:

$$\chi_2(x) = \varphi_2(x) + \psi_2(x),$$

$$\theta_2(x) = \varphi_2(x)\psi_2(x)$$

By hypothesis, the polynomials $\varphi_1(x)$ and $\varphi_2(x)$ are in the same class $A$ and therefore their difference $\varphi_1(x) - \varphi_2(x)$ is exactly divisible by $f(x)$; the difference $\psi_1(x) - \psi_2(x)$ has the same property. From this it follows that the difference

$$\chi_1(x) - \chi_2(x) = [\varphi_1(x) + \psi_1(x)] - [\varphi_2(x) + \psi_2(x)] = [\varphi_1(x) - \varphi_2(x)] + [\psi_1(x) - \psi_2(x)] \tag{4}$$

is also exactly divisible by the polynomial $f(x)$. This is also true of the difference $\theta_1(x) - \theta_2(x)$ since

$$\theta_1(x) - \theta_2(x) = \varphi_1(x)\psi_1(x) - \varphi_2(x)\psi_2(x)$$

$$= \varphi_1(x)\psi_1(x) - \varphi_1(x)\psi_2(x) + \varphi_1(x)\psi_2(x) - \varphi_2(x)\psi_2(x)$$

$$= \varphi_1(x)[\psi_1(x) - \psi_2(x)] + [\varphi_1(x) - \varphi_2(x)]\psi_2(x) \tag{5}$$

Equation (4) shows that the polynomials $\chi_1(x)$ and $\chi_2(x)$ lie in the same class. In other words, the sum of any polynomial from class $A$ and any polynomial from class $B$ belongs to a very definite class $C$, which does not depend on what polynomials are chosen as "representatives" in classes $A$ and $B$. We call this class $C$ the sum of the classes $A$ and $B$:

$$C = A + B$$

Similarly, because of (5), there is a class $D$ which is independent of the choice of representatives in classes $A$ and $B$ and in which lies the product of any polynomial of $A$ by any polynomial of $B$. We call this class the product of the classes $A$ and $B$:

$$D = AB$$
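The independence from the choice of representatives can be checked numerically on one example. This is a minimal sketch under assumptions not made in the text: rational coefficients, a monic modulus chosen as $x^3 - 2$, coefficient lists ordered from the constant term upward, and hypothetical helper names `add`, `mul` and `remainder`.

```python
from fractions import Fraction

def mul(p, q):
    """Product of two nonempty coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def remainder(p, f):
    """Remainder of p on division by a monic f; it identifies the class of p."""
    p = [Fraction(c) for c in p]
    n = len(f) - 1
    while len(p) > n:
        lead = p.pop()                       # coefficient of the current top power
        for i in range(n):                   # subtract lead * x^shift * f(x)
            p[len(p) - n + i] -= lead * f[i]
    return p + [Fraction(0)] * (n - len(p))

f = [-2, 0, 0, 1]                            # x^3 - 2 (an assumed example)
phi1, psi1 = [1, 2], [0, 1, 1]               # representatives of classes A and B
phi2 = add(phi1, mul(f, [5, 1]))             # another representative of A
psi2 = add(psi1, mul(f, [-3]))               # another representative of B

same_sum = remainder(add(phi1, psi1), f) == remainder(add(phi2, psi2), f)
same_product = remainder(mul(phi1, psi1), f) == remainder(mul(phi2, psi2), f)
```

Both comparisons come out true, as equations (4) and (5) predict: changing a representative by a multiple of $f(x)$ changes neither the class of the sum nor the class of the product.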


We shall show that the collection of classes into which we have partitioned the ring $P[x]$ of polynomials is converted into a field after the indicated introduction of the operations of addition and multiplication. Indeed, the validity of the associative and commutative laws for both operations and of the distributive law follows from the validity of these laws in the ring $P[x]$, since operations on classes reduce to operations on the polynomials lying in these classes. The role of zero is evidently played by the class composed of polynomials divisible exactly by the polynomial $f(x)$. We call this the zero class and denote it by the symbol 0. The opposite of class $A$, which is made up of polynomials that yield the remainder $\varphi(x)$ upon division by $f(x)$, is the class made up of polynomials which yield the remainder $-\varphi(x)$ upon division by $f(x)$, whence it follows that subtraction is unique on the set of classes.

To prove that division is possible on the set of classes, we have to show that there exists a class playing the role of unity and that for any class different from zero there is an inverse class. The class of polynomials which upon division by $f(x)$ yield a remainder 1 will obviously be unity. We call this the unit class and denote it by the symbol $E$.

Now suppose we have a class $A$ different from zero. A polynomial $\varphi(x)$ chosen in $A$ as a representative will thus not be exactly divisible by $f(x)$ and therefore, because of the irreducibility of $f(x)$, these two polynomials are relatively prime. Thus, in the ring $P[x]$ there exist polynomials $u(x)$ and $v(x)$ that satisfy the equation

$$\varphi(x)u(x) + f(x)v(x) = 1$$

whence

$$\varphi(x)u(x) = 1 - f(x)v(x) \tag{6}$$

Upon division by $f(x)$, the right member of (6) yields a remainder 1, which means it belongs to the unit class $E$. If the class to which the polynomial $u(x)$ belongs is denoted by $B$, then (6) shows that

$$AB = E$$

whence $B = A^{-1}$. This is proof of the existence of an inverse class for every nonzero class; in other words, this completes the proof that classes form a field.

We will denote this field by $\bar{P}$ and will show that it is an extension of the field $P$. With every element $a$ of the field $P$ is associated a class composed of polynomials which upon division by $f(x)$ yield a remainder $a$; the element $a$ itself, regarded as a zero-degree polynomial, belongs to this class. All classes of this special type constitute, in the field $\bar{P}$, a subfield that is isomorphic to the field $P$. Indeed, the one-to-one nature of the correspondence is obvious; on the other hand, for representatives in these classes we can choose elements of the field $P$ and therefore with the sum (product) of elements of $P$ is associated a sum (product) of corresponding classes. Consequently, in the future we will not need to distinguish between the elements of a field $P$ and the classes corresponding to them.

Finally, use $X$ to denote the class made up of polynomials which upon division by $f(x)$ yield the remainder $x$. This class is a definite element of the field $\bar{P}$, and we wish to demonstrate that it is a root of the polynomial $f(x)$. Let

$$f(x) = a_0x^n + a_1x^{n-1} + \dots + a_{n-1}x + a_n$$

We denote by $A_i$ the class corresponding, in the foregoing sense, to the element $a_i$ of the field $P$, $i = 0, 1, \dots, n$, and will find out what the element

$$A_0X^n + A_1X^{n-1} + \dots + A_{n-1}X + A_n \tag{7}$$

of the field $\bar{P}$ is equal to. Assuming elements $a_i$, $i = 0, 1, \dots, n$, to be representatives of the classes $A_i$ and the polynomial $x$ to be a representative of the class $X$, and using the definition of addition and multiplication of classes, we find that the polynomial $f(x)$ is itself contained in class (7). However, $f(x)$ is exactly divisible by itself and therefore class (7) turns out to be the zero class. Thus, by replacing in (7) the classes $A_i$ by the elements $a_i$ of $P$ corresponding to them, we find that the following equality holds in the field $\bar{P}$:

$$a_0X^n + a_1X^{n-1} + \dots + a_{n-1}X + a_n = 0$$

That is to say, the class $X$ is indeed a root of the polynomial $f(x)$.

This completes the proof of the theorem on the existence of a root. Note that by taking the field of real numbers for $P$ and setting $f(x) = x^2 + 1$, we obtain yet another method for constructing the field of complex numbers.
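This remark can be illustrated by computing with classes modulo $x^2 + 1$. The sketch is an assumption-laden illustration: rational coefficients stand in for real ones, a class is stored as the pair of coefficients of its remainder $a + bx$, and the names `remainder`, `mul_classes` and `add_classes` are hypothetical.

```python
from fractions import Fraction

F = [1, 0, 1]          # x^2 + 1, coefficients listed from the constant term upward

def remainder(p, f=F):
    """Remainder of p on division by the monic polynomial f."""
    p = [Fraction(c) for c in p]
    n = len(f) - 1
    while len(p) > n:
        lead = p.pop()
        for i in range(n):
            p[len(p) - n + i] -= lead * f[i]
    return p + [Fraction(0)] * (n - len(p))

def mul_classes(a, b):
    """Product of the classes represented by a and b."""
    prod = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] += x * y
    return remainder(prod)

def add_classes(a, b):
    """Sum of the classes represented by a and b."""
    return [x + y for x, y in zip(remainder(a), remainder(b))]

X = [0, 1]             # the class of the polynomial x
ONE = [1, 0]           # the unit class

x_squared_plus_one = add_classes(mul_classes(X, X), ONE)
product = mul_classes([2, 3], [1, -1])      # (2 + 3X)(1 - X)
```

The class $X^2 + 1$ is the zero class, and the product has coefficients 5 and 1, matching $(2 + 3i)(1 - i) = 5 + i$ for ordinary complex numbers.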

Certain corollaries can be derived from the theorem on the existence of a root similar to those derived in Sec. 24 from the fundamental theorem of the algebra of complex numbers. One remark is in order first, however. Since any linear factor $x - c$ of a polynomial $f(x)$ is irreducible, it must appear in the unique factorization of $f(x)$ into irreducible factors.

However, the number of linear factors in the factorization of $f(x)$ into irreducible factors cannot exceed the degree of the polynomial. We get the following result.

A polynomial $f(x)$ of degree $n$ cannot have more than $n$ roots in the field $P$, even if each of the roots is counted with its multiplicity.
We use the term splitting field of a polynomial $f(x)$ of degree $n$ over the field $P$ for an extension $Q$ of $P$ such that $Q$ contains $n$ roots of $f(x)$ (counting multiplicity in the case of multiple roots). Consequently, over the field $Q$ the polynomial $f(x)$ will decompose into linear factors, and no further extension of the field $Q$ can make new roots appear for $f(x)$.

For every polynomial $f(x)$ in the ring $P[x]$ there is a splitting field over the field $P$.

Indeed, if a polynomial $f(x)$ of degree $n$, $n \geq 1$, has $n$ roots in the field $P$ itself, then $P$ will be the desired splitting field. But if $f(x)$ does not decompose into linear factors over $P$, then we take one of its nonlinear irreducible factors $\varphi(x)$ and, on the basis of the theorem of the existence of a root, we extend $P$ to the field $P'$, which contains a root of $\varphi(x)$. If the polynomial $f(x)$ still does not break up into linear factors over $P'$, we again extend the field, thus creating a root for one more of the remaining nonlinear irreducible factors. In a finite number of steps we will obviously arrive at the splitting field for $f(x)$.

Quite naturally, $f(x)$ can have many different splitting fields. One can prove that all the minimal fields containing the field $P$ and $n$ roots of the polynomial $f(x)$ (where $n$ is the degree of the polynomial) are isomorphic. However, we will not make use of this assertion and will therefore not give the proof.

Multiple roots. In the previous section we proved that a polynomial $f(x)$ over a field $P$ of characteristic 0 does not have multiple factors if and only if it is relatively prime to its derivative; it was also noted that the absence, in $f(x)$, of multiple factors over $P$ implies the absence of such factors over any extension $\bar{P}$ of the field $P$. Let us apply this to the case when $\bar{P}$ is a splitting field for $f(x)$; recalling the definition of a multiple root, we arrive at the following result.

If a polynomial $f(x)$ over a field $P$ of characteristic 0 does not have multiple roots in the given splitting field, then it is relatively prime to its derivative $f'(x)$. Conversely, if $f(x)$ is relatively prime to its derivative, then it does not have multiple roots in any one of its splitting fields.

Whence, in particular, it follows that a polynomial $f(x)$ which is irreducible over a field $P$ of characteristic 0 cannot have multiple roots in any extension of the field. This assertion does not hold in fields of a finite characteristic. This circumstance plays a perceptible role in the general theory of fields.
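The criterion lends itself to a direct computation: find the greatest common divisor of $f(x)$ and $f'(x)$ by Euclid's algorithm and see whether it is a constant. The sketch below assumes the field of rationals (characteristic 0) and coefficient lists ordered from the constant term upward; `poly_rem`, `poly_gcd`, `derivative` and `has_multiple_roots` are hypothetical names.

```python
from fractions import Fraction

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def derivative(p):
    """Formal derivative of a coefficient list."""
    return trim([i * c for i, c in enumerate(p)][1:])

def poly_rem(p, f):
    """Remainder of p on division by the nonzero polynomial f."""
    p = [Fraction(c) for c in trim(p)]
    f = trim(f)
    while len(p) >= len(f):
        shift = len(p) - len(f)
        coef = p[-1] / f[-1]
        for i, c in enumerate(f):
            p[shift + i] -= coef * c
        p = trim(p)
    return p

def poly_gcd(a, b):
    """Monic greatest common divisor by Euclid's algorithm (a must be nonzero)."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_rem(a, b)
    return [Fraction(c) / Fraction(a[-1]) for c in a]

def has_multiple_roots(f):
    """True when f and f' have a nonconstant common divisor."""
    return len(poly_gcd(f, derivative(f))) > 1

f1 = [2, -3, 0, 1]     # x^3 - 3x + 2 = (x - 1)^2 (x + 2)
f2 = [-2, 0, 1]        # x^2 - 2, irreducible over the rationals
```

For `f1` the common divisor is $x - 1$, reflecting the double root 1; for `f2` it is the constant 1, in line with the statement about irreducible polynomials.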

Note in conclusion that for an arbitrary field, the Vieta formulas hold too (see Sec. 24); here, the roots of the polynomial are taken in some splitting field of this polynomial.
50. The Field of Rational Fractions


The theory of rational fractions described in Sec. 25 holds in full for the case of an arbitrary base field as well. However, when passing from the field of real numbers to an arbitrary field $P$, the view taken of the expression $\frac{f(x)}{g(x)}$ as a function of the variable $x$ must be rejected, for, as we know, it is not applicable to polynomials. Our job here is to figure out the meaning of these expressions for the case when the coefficients belong to an arbitrary field $P$. More precisely, we want to construct a field containing the polynomial ring $P[x]$ and in such a way that the operations of addition and multiplication defined in the new field coincide, as applied to polynomials, with the operations in the ring $P[x]$; in short, the ring $P[x]$ must be a subring of this new field. On the other hand, any element of the new field must be representable (in the sense of division as defined in this field) in the form of a quotient of two polynomials. As will now be shown, such a field can be constructed for any $P$. We denote it by $P(x)$ (the unknown is in the parentheses) and call it the field of rational fractions over the field $P$.

First assume that the ring $P[x]$ is already a subring of some field $Q$. If $f(x)$ and $g(x)$ are arbitrary polynomials from $P[x]$, and $g(x) \neq 0$, then there is, in the field $Q$, a uniquely defined element equal to the quotient obtained by the division of $f(x)$ by $g(x)$. Denoting this element by $\frac{f(x)}{g(x)}$, as is the usual way in the case of a field, we can write the following equation on the basis of the definition of a quotient:

$$f(x) = g(x) \cdot \frac{f(x)}{g(x)} \tag{1}$$

where the product is to be understood in the sense of multiplication in the field $Q$. It may happen that some quotients $\frac{f(x)}{g(x)}$ and $\frac{\varphi(x)}{\psi(x)}$ are one and the same element of $Q$. The condition for this is the ordinary condition of equality of fractions:

$$\frac{f(x)}{g(x)} = \frac{\varphi(x)}{\psi(x)} \text{ if and only if } f(x)\psi(x) = g(x)\varphi(x).$$

Indeed, if $\frac{f(x)}{g(x)} = \frac{\varphi(x)}{\psi(x)} = \alpha$, then, by (1),

$$f(x) = g(x)\alpha, \quad \varphi(x) = \psi(x)\alpha$$

whence

$$f(x)\psi(x) = g(x)\psi(x)\alpha = g(x)\varphi(x)$$

Conversely, if $f(x)\psi(x) = g(x)\varphi(x) = u(x)$ in the sense of multiplication in the ring $P[x]$, then, passing to the field $Q$, we obtain the equalities

$$\frac{f(x)}{g(x)} = \frac{u(x)}{g(x)\psi(x)} = \frac{\varphi(x)}{\psi(x)}$$

Furthermore, it is easy to see that the sum and product of any elements of $Q$ which are quotients of polynomials in $P[x]$ can again be represented in the form of such quotients, and the ordinary rules of addition and multiplication of fractions hold true:

$$\frac{f(x)}{g(x)} + \frac{\varphi(x)}{\psi(x)} = \frac{f(x)\psi(x) + g(x)\varphi(x)}{g(x)\psi(x)} \tag{2}$$

$$\frac{f(x)}{g(x)} \cdot \frac{\varphi(x)}{\psi(x)} = \frac{f(x)\varphi(x)}{g(x)\psi(x)} \tag{3}$$

Indeed, multiplying both sides of these equations by the product $g(x)\psi(x)$ and applying (1), we get equalities which hold true in the ring $P[x]$. The validity of (2) and (3) now follows from the fact that, thanks to the absence of zero divisors in the field $Q$, both sides of each of the resulting equalities may be reduced by a nonzero element $g(x)\psi(x)$ without spoiling the equalities.

These preliminary remarks suggest the path we should take in constructing the field $P(x)$. Suppose we have an arbitrary field $P$ and over it a polynomial ring $P[x]$. With every ordered pair of polynomials $f(x)$, $g(x)$, where $g(x) \neq 0$, we associate the symbol $\frac{f(x)}{g(x)}$, called a rational fraction with numerator $f(x)$ and denominator $g(x)$. We stress the fact that this is only a symbol corresponding to the given pair of polynomials, since, generally speaking, division of polynomials in the ring $P[x]$ itself is impossible, and so far the ring $P[x]$ is not contained in any field. Even if $g(x)$ is a divisor of $f(x)$, the new symbol $\frac{f(x)}{g(x)}$ should for the time being be distinguished from the polynomial obtained as the quotient in the division of $f(x)$ by $g(x)$.

We now call the rational fractions $\frac{f(x)}{g(x)}$ and $\frac{\varphi(x)}{\psi(x)}$ equal,

$$\frac{f(x)}{g(x)} = \frac{\varphi(x)}{\psi(x)} \tag{4}$$

if in the ring $P[x]$ we have the equality $f(x)\psi(x) = g(x)\varphi(x)$. It is obvious that any fraction is equal to itself and that if one fraction is equal to another, then the second one is equal to the first. Let us prove the transitive property of this concept of equality. We are given equalities (4) and

$$\frac{\varphi(x)}{\psi(x)} = \frac{u(x)}{v(x)} \tag{5}$$

From the equalities

$$f(x)\psi(x) = g(x)\varphi(x), \quad \varphi(x)v(x) = \psi(x)u(x)$$

equivalent to them in the ring $P[x]$ it follows that

$$f(x)v(x)\psi(x) = g(x)\varphi(x)v(x) = g(x)u(x)\psi(x)$$

and therefore, after cancelling out the nonzero (as the denominator of one of the fractions) polynomial $\psi(x)$, we get

$$f(x)v(x) = g(x)u(x)$$

whence, by the definition of the equality of fractions,

$$\frac{f(x)}{g(x)} = \frac{u(x)}{v(x)}$$

This completes the proof.

Now let us combine into one class all fractions equal to some one given fraction, and therefore (by virtue of the transitivity of the equality) equal among themselves. If one class has even a single fraction not contained in another class, then, as follows from the transitivity of the equality, these two classes do not have a single element in common.

Thus, the collection of all rational fractions written by means of polynomials from the ring $P[x]$ breaks up into disjoint classes of fractions equal among themselves. We would now like to define algebraic operations in this set of classes of equal fractions so that it becomes a field. To do this, we will define operations on rational fractions and will each time verify that the replacement of summands (or factors) by fractions equal to them replaces the sum (or product) also by an equal fraction. This will enable us to speak of the sum and product of classes of equal fractions.

First, let us make the following remark which will be used repeatedly in what follows. A rational fraction becomes an equal fraction if its numerator and denominator are multiplied by one and the same nonzero polynomial, or reduced by any common factor. Indeed,

$$\frac{f(x)}{g(x)} = \frac{f(x)h(x)}{g(x)h(x)}$$

since in the ring $P[x]$

$$f(x)[g(x)h(x)] = g(x)[f(x)h(x)]$$

We define the addition of rational fractions by formula (2). Since from $g(x) \neq 0$ and $\psi(x) \neq 0$ it follows that $g(x)\psi(x) \neq 0$, the right member of this formula is indeed a rational fraction. Furthermore, if it is given that

$$\frac{f(x)}{g(x)} = \frac{f_0(x)}{g_0(x)}, \quad \frac{\varphi(x)}{\psi(x)} = \frac{\varphi_0(x)}{\psi_0(x)}$$

that is,

$$f(x)g_0(x) = g(x)f_0(x), \quad \varphi(x)\psi_0(x) = \psi(x)\varphi_0(x) \tag{6}$$

then, by multiplying both members of the first of the equalities (6) by $\psi(x)\psi_0(x)$, both members of the second equality by $g(x)g_0(x)$ and then adding these equalities termwise, we obtain

$$[f(x)\psi(x) + g(x)\varphi(x)]\,g_0(x)\psi_0(x) = [f_0(x)\psi_0(x) + g_0(x)\varphi_0(x)]\,g(x)\psi(x)$$

which is equivalent to the equation

$$\frac{f(x)\psi(x) + g(x)\varphi(x)}{g(x)\psi(x)} = \frac{f_0(x)\psi_0(x) + g_0(x)\varphi_0(x)}{g_0(x)\psi_0(x)}$$

Thus, if we have two classes of equal fractions, the sum of any fraction of one class and any fraction of the other class is equal to any other such sum, that is to say, such sums lie in some definite third class. This class is called the sum of the two given classes.

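The commutativity of this addition follows directly from (2), the associativity is proved as follows:

$$\left[\frac{f(x)}{g(x)} + \frac{\varphi(x)}{\psi(x)}\right] + \frac{u(x)}{v(x)} = \frac{f(x)\psi(x) + g(x)\varphi(x)}{g(x)\psi(x)} + \frac{u(x)}{v(x)}$$

$$= \frac{f(x)\psi(x)v(x) + g(x)\varphi(x)v(x) + g(x)\psi(x)u(x)}{g(x)\psi(x)v(x)}$$

$$= \frac{f(x)}{g(x)} + \frac{\varphi(x)v(x) + \psi(x)u(x)}{\psi(x)v(x)} = \frac{f(x)}{g(x)} + \left[\frac{\varphi(x)}{\psi(x)} + \frac{u(x)}{v(x)}\right]$$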

From the definition of equality of fractions it is easy to derive that all fractions of the form $\frac{0}{g(x)}$, i.e., fractions with zero numerator, are equal and that they form a complete class of equal fractions. We call this class the zero class and we will prove that in our addition it plays the part of zero. Indeed, if we have an arbitrary fraction $\frac{\varphi(x)}{\psi(x)}$, then

$$\frac{0}{g(x)} + \frac{\varphi(x)}{\psi(x)} = \frac{g(x)\varphi(x)}{g(x)\psi(x)} = \frac{\varphi(x)}{\psi(x)}$$

From the equation

$$\frac{f(x)}{g(x)} + \frac{-f(x)}{g(x)} = \frac{0}{g^2(x)}$$

the right side of which belongs to the zero class, it now follows that the class of fractions equal to the fraction $\frac{-f(x)}{g(x)}$ will be opposite to the class of fractions equal to the fraction $\frac{f(x)}{g(x)}$. From this, as we know, follows the validity of unique subtraction.

We define multiplication of rational fractions by formula (3); since $g(x)\psi(x) \neq 0$, the right member of this formula will indeed be a rational fraction. Furthermore, if

$$\frac{f(x)}{g(x)} = \frac{f_0(x)}{g_0(x)}, \quad \frac{\varphi(x)}{\psi(x)} = \frac{\varphi_0(x)}{\psi_0(x)}$$

that is,

$$f(x)g_0(x) = g(x)f_0(x), \quad \varphi(x)\psi_0(x) = \psi(x)\varphi_0(x)$$

then, by multiplying out these latter equations termwise, we get

$$f(x)g_0(x)\varphi(x)\psi_0(x) = g(x)f_0(x)\psi(x)\varphi_0(x)$$

which is equivalent to the equation

$$\frac{f(x)\varphi(x)}{g(x)\psi(x)} = \frac{f_0(x)\varphi_0(x)}{g_0(x)\psi_0(x)}$$

Thus, by analogy with the above-defined sum of classes, we can speak of a product of classes of equal fractions.

The commutativity and associativity of this multiplication follow immediately from (3) and the validity of the distributive law is proved as follows:

$$\left[\frac{f(x)}{g(x)} + \frac{\varphi(x)}{\psi(x)}\right] \cdot \frac{u(x)}{v(x)} = \frac{f(x)\psi(x) + g(x)\varphi(x)}{g(x)\psi(x)} \cdot \frac{u(x)}{v(x)} = \frac{[f(x)\psi(x) + g(x)\varphi(x)]\,u(x)}{g(x)\psi(x)v(x)}$$

$$= \frac{f(x)\psi(x)u(x)v(x) + g(x)\varphi(x)u(x)v(x)}{g(x)\psi(x)v^2(x)} = \frac{f(x)u(x)}{g(x)v(x)} + \frac{\varphi(x)u(x)}{\psi(x)v(x)}$$

$$= \frac{f(x)}{g(x)} \cdot \frac{u(x)}{v(x)} + \frac{\varphi(x)}{\psi(x)} \cdot \frac{u(x)}{v(x)}$$


It is easy to see that fractions of the type $\frac{f(x)}{f(x)}$, i.e., fractions whose numerators are equal to the denominators, are equal and constitute a separate class. This class is termed the unit class and in our multiplication plays the role of unity:

$$\frac{f(x)}{f(x)} \cdot \frac{\varphi(x)}{\psi(x)} = \frac{f(x)\varphi(x)}{f(x)\psi(x)} = \frac{\varphi(x)}{\psi(x)}$$

Finally, if the fraction $\frac{f(x)}{g(x)}$ does not belong to the zero class, i.e., $f(x) \neq 0$, then there is a fraction $\frac{g(x)}{f(x)}$. Since

$$\frac{f(x)}{g(x)} \cdot \frac{g(x)}{f(x)} = \frac{f(x)g(x)}{g(x)f(x)}$$

and the right member of this equality belongs to the unit class, the class of fractions equal to the fraction $\frac{g(x)}{f(x)}$ will be inverse to the class of fractions equal to $\frac{f(x)}{g(x)}$. Whence follows the validity of unique division.

Thus, the classes of equal rational fractions with coefficients from the field $P$ constitute, in our definition of operations, a commutative field. This is the desired field $P(x)$. Incidentally, we still have to prove that this field which we have constructed contains a subring isomorphic to the ring $P[x]$ and that every element of the field can be represented as a quotient of two elements of this subring.

If we associate with an arbitrary polynomial $f(x)$ from the ring $P[x]$ a class of rational fractions equal to the fraction $\frac{f(x)}{1}$ (among all these fractions there are of course fractions whose denominators are equal to unity), we obtain a one-to-one mapping of the ring $P[x]$ into the field we have constructed. Indeed, from the equality

$$\frac{f(x)}{1} = \frac{\varphi(x)}{1}$$

it would follow that $f(x) \cdot 1 = 1 \cdot \varphi(x)$, that is to say, $f(x) = \varphi(x)$. This mapping will even be isomorphic, as the following equations show:

$$\frac{f(x)}{1} + \frac{g(x)}{1} = \frac{f(x) \cdot 1 + g(x) \cdot 1}{1 \cdot 1} = \frac{f(x) + g(x)}{1},$$

$$\frac{f(x)}{1} \cdot \frac{g(x)}{1} = \frac{f(x)g(x)}{1}$$

Thus, the classes of fractions equal to fractions of the form $\frac{f(x)}{1}$ constitute, in our field, a subring that is isomorphic to the ring $P[x]$. The fraction $\frac{f(x)}{1}$ can therefore be denoted simply as $f(x)$. And finally, since for $g(x) \neq 0$ the class of fractions equal to the fraction $\frac{1}{g(x)}$ is the inverse of the class of fractions equal to the fraction $\frac{g(x)}{1}$, it follows from the equality

$$\frac{f(x)}{1} \cdot \frac{1}{g(x)} = \frac{f(x)}{g(x)}$$

that all elements of our field may be considered (in the sense of operations defined in this field) to be quotients of polynomials of the ring $P[x]$.

Over an arbitrary field $P$ we constructed the field of rational fractions $P(x)$. Using this same method, we can construct the field of rational numbers by taking the ring of integers in place of the ring of polynomials. Combining these two cases and using the same kind of method, we could prove a theorem asserting that, generally, any commutative ring without divisors of zero is a subring of some field.
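The construction can be tried out on the simplest case mentioned here, the ring of integers. In the sketch below a fraction is an ordered pair `(numerator, denominator)` with nonzero denominator, equality is the cross-multiplication test, and the operations follow formulas (2) and (3); the function names `equal`, `add` and `mul` are hypothetical and chosen only for illustration.

```python
def equal(p, q):
    """(a, b) and (c, d) denote the same class when a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c

def add(p, q):
    """Formula (2): a/b + c/d = (a*d + b*c) / (b*d)."""
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)

def mul(p, q):
    """Formula (3): a/b * c/d = (a*c) / (b*d)."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)

half, third = (1, 2), (1, 3)
half_again, third_again = (2, 4), (3, 9)      # other fractions of the same classes

# The sum does not depend on the representatives chosen in each class.
sum_is_well_defined = equal(add(half, third), add(half_again, third_again))

# A fraction with zero numerator acts as zero; a/b times b/a lies in the unit class.
zero_acts_as_zero = equal(add((0, 5), (3, 7)), (3, 7))
inverse_gives_unit = equal(mul((1, 2), (2, 1)), (1, 1))
```

No reduction to lowest terms is performed anywhere: as in the text, the elements of the field are the classes of equal fractions, and `equal` decides whether two pairs lie in the same class.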



CHAPTER 11 


POLYNOMIALS 
IN SEVERAL UNKNOWNS 


51. The Ring of Polynomials in Several Unknowns 


One often has to consider polynomials that depend on two, three, and, generally, several unknowns. In the first chapters of this book, we studied linear and quadratic forms, which are examples of such polynomials. Generally speaking, a polynomial $f(x_1, x_2, \dots, x_n)$ in $n$ unknowns $x_1, x_2, \dots, x_n$ over some field $P$ is the sum of a finite number of terms of the form $x_1^{k_1}x_2^{k_2}\cdots x_n^{k_n}$, where all $k_i \geq 0$, with coefficients from the field $P$. It is assumed, quite naturally, that the polynomial $f(x_1, x_2, \dots, x_n)$ does not contain like terms and that only terms with nonzero coefficients are considered. Two polynomials in $n$ unknowns, $f(x_1, x_2, \dots, x_n)$ and $g(x_1, x_2, \dots, x_n)$, are called equal (or identically equal) if the coefficients of like terms are equal.

If a polynomial $f(x_1, x_2, \dots, x_n)$ is given over a field $P$, then its degree with respect to the unknown $x_i$, $i = 1, 2, \dots, n$, is the highest exponent with which $x_i$ appears in the terms of the polynomial. By chance, the power may be 0, which means that although $f$ is considered a polynomial in $n$ unknowns $x_1, x_2, \dots, x_i, \dots, x_n$, the unknown $x_i$ does not actually appear in the notation.

On the other hand, if we call the number $k_1 + k_2 + \dots + k_n$ (that is, the sum of the exponents of the unknowns) the degree of the term

$$x_1^{k_1}x_2^{k_2}\cdots x_n^{k_n}$$

then the degree of the polynomial $f(x_1, x_2, \dots, x_n)$ (that is, the degree of the unknowns taken together) is the highest degree of its terms. In particular, as in the case of one unknown, only nonzero elements from the field $P$ will be polynomials of degree zero. On the other hand, as in the case of polynomials in one unknown, zero will be the only polynomial in $n$ unknowns whose degree is not defined. Of course, a polynomial can in the general case contain several highest-degree terms and therefore one cannot speak of the highest-degree term of a polynomial.
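The two notions of degree can be made concrete with a simple data structure. The following sketch assumes a representation not used in the text: a polynomial is a dictionary mapping exponent tuples $(k_1, \dots, k_n)$ to nonzero coefficients, and `degree_in` and `total_degree` are hypothetical helper names. The sample polynomial is likewise an assumed example.

```python
def degree_in(poly, i):
    """Degree with respect to the unknown with 0-based index i; None for the zero polynomial."""
    if not poly:
        return None
    return max(k[i] for k in poly)

def total_degree(poly):
    """Highest degree of a term; None for the zero polynomial, whose degree is undefined."""
    if not poly:
        return None
    return max(sum(k) for k in poly)

# 3*x1^2*x2 - x3^4 + 5*x1 + x1*x2^3, a polynomial in three unknowns
f = {(2, 1, 0): 3, (0, 0, 4): -1, (1, 0, 0): 5, (1, 3, 0): 1}

degrees = [degree_in(f, i) for i in range(3)]
top_terms = [k for k in f if sum(k) == total_degree(f)]
```

Here `degrees` lists the degree in each unknown separately, while `top_terms` contains two exponent tuples, which shows why one cannot speak of a single highest-degree term.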

The operations of addition and multiplication are defined as follows for polynomials in $n$ unknowns over a field $P$. The sum of the polynomials $f(x_1, x_2, \dots, x_n)$ and $g(x_1, x_2, \dots, x_n)$ is a polynomial whose coefficients are obtained by adding the corresponding coefficients of the polynomials $f$ and $g$; if some term occurs in only one of the polynomials $f$, $g$, then its coefficient in the other polynomial is naturally taken to be zero. The product of two "monomials" is defined by the equation

$$a x_1^{k_1}x_2^{k_2}\cdots x_n^{k_n} \cdot b x_1^{l_1}x_2^{l_2}\cdots x_n^{l_n} = (ab)\,x_1^{k_1+l_1}x_2^{k_2+l_2}\cdots x_n^{k_n+l_n}$$

after which the product of the polynomials $f(x_1, x_2, \dots, x_n)$ and $g(x_1, x_2, \dots, x_n)$ is defined as the result of a termwise multiplication and subsequent collecting of like terms.
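These definitions translate almost literally into code. The sketch below assumes the dictionary representation (exponent tuple mapped to coefficient) and integer coefficients for simplicity; `add` and `mul` are hypothetical helper names, not notation from the text.

```python
def add(f, g):
    """Add coefficients of like terms; a missing term counts as coefficient zero."""
    out = dict(f)
    for k, c in g.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def mul(f, g):
    """Multiply termwise (exponents add, coefficients multiply), then collect like terms."""
    out = {}
    for k, a in f.items():
        for l, b in g.items():
            key = tuple(ki + li for ki, li in zip(k, l))
            out[key] = out.get(key, 0) + a * b
    return {k: c for k, c in out.items() if c != 0}

f = {(1, 0): 1, (0, 1): 1}      # x1 + x2
g = {(1, 0): 1, (0, 1): -1}     # x1 - x2

total = add(f, g)
product = mul(f, g)
```

In `product` the two mixed terms $x_1x_2$ and $-x_1x_2$ cancel when like terms are collected, leaving $x_1^2 - x_2^2$; terms with zero coefficient are dropped, in keeping with the convention that only nonzero terms are recorded.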

Given this definition of operations, the collection of polynomials in $n$ unknowns over the field $P$ becomes a commutative ring, which does not contain divisors of zero. Indeed, for $n = 1$ our definitions coincide with those which were given in Sec. 20 for the case of polynomials in one unknown. Let it already be proved that the polynomials in $n - 1$ unknowns $x_1, x_2, \dots, x_{n-1}$ with coefficients from the field $P$ constitute a ring without divisors of zero. Any polynomial in $n$ unknowns $x_1, x_2, \dots, x_{n-1}, x_n$ may be uniquely represented as a polynomial in the unknown $x_n$ with coefficients which are polynomials in $x_1, x_2, \dots, x_{n-1}$; conversely, any polynomial in $x_n$ with coefficients from the ring of polynomials in $x_1, x_2, \dots, x_{n-1}$ over the field $P$ may of course be regarded as a polynomial over this same field $P$ with respect to the entire collection of unknowns $x_1, x_2, \dots, x_n$. It may readily be verified that the one-to-one correspondence we have obtained between the polynomials in $n$ unknowns and the polynomials in one unknown over the ring of polynomials in $n - 1$ unknowns is isomorphic with respect to the operations of addition and multiplication. The assertion being proved follows now from the fact that polynomials in one unknown over the ring of polynomials in $n - 1$ unknowns themselves constitute a ring, and, as a ring of polynomials in one unknown over a ring without zero divisors, it does not itself contain any divisors of zero (see Sec. 47).

Consequently, we have proved the existence of a ring of polynomials in $n$ unknowns over the field $P$. This ring is denoted by the symbol $P[x_1, x_2, \dots, x_n]$.

The following considerations permit regarding the ring of polynomials in $n$ unknowns from a somewhat different angle. Let a field $P$ be contained in some commutative ring $L$ as a subring. In $L$ take $n$ elements $\alpha_1, \alpha_2, \dots, \alpha_n$ and find the minimal subring $L'$ of the ring $L$ which contains these elements and also the entire field $P$, that is, the subring obtained by adjoining the elements $\alpha_1, \alpha_2, \dots, \alpha_n$ to the field $P$. The subring $L'$ consists of all elements of the ring $L$ which are expressed in terms of the elements $\alpha_1, \alpha_2, \dots, \alpha_n$ and the elements of the field $P$ by means of addition, subtraction and multiplication. It is easy to see that what we have are precisely those elements of the ring $L$ which may be written (with the aid of the operations occurring in $L$) in the form of polynomials in $\alpha_1, \alpha_2, \dots, \alpha_n$ with coefficients from $P$; these elements, being elements of the ring $L$, will add and multiply precisely in accord with the rules of addition and multiplication of polynomials in $n$ unknowns.

Of course, speaking generally, a given element $\beta$ of the subring $L'$ will possess many different notations in the form of a polynomial in $\alpha_1, \alpha_2, \dots, \alpha_n$ with coefficients from the field $P$. If for any $\beta$ in $L'$ such a notation is unique, i.e., if the different polynomials in $\alpha_1, \alpha_2, \dots, \alpha_n$ are distinct elements of the ring $L'$ (and, hence, of the ring $L$), then the system of elements $\alpha_1, \alpha_2, \dots, \alpha_n$ is called algebraically independent over the field $P$, otherwise it is algebraically dependent.* From this we can draw the following conclusion.

If the field $P$ is a subring of a commutative ring $L$ and if the system of elements $\alpha_1, \alpha_2, \dots, \alpha_n$ of $L$ is algebraically independent over $P$, then the subring $L'$ of the ring $L$ generated by adjoining to $P$ the elements $\alpha_1, \alpha_2, \dots, \alpha_n$ is isomorphic to the polynomial ring $P[x_1, x_2, \dots, x_n]$.

Of the other properties of the ring $P[x_1, x_2, \dots, x_n]$ of polynomials in $n$ unknowns we indicate the following: this ring may be included in the field $P(x_1, x_2, \dots, x_n)$ of rational fractions in $n$ unknowns over the field $P$. Every element of this field can be written as $\frac{f}{g}$, where $f$ and $g$ are polynomials of the ring $P[x_1, x_2, \dots, x_n]$; then $\frac{f}{g} = \frac{\varphi}{\psi}$ if and only if $f\psi = g\varphi$. Addition and multiplication of these rational fractions is performed by the rules which, as indicated in Sec. 45, hold true for quotients in any field. The existence proof of the field $P(x_1, x_2, \dots, x_n)$ is carried out just as it was in Sec. 50 for the case $n = 1$.

We can construct a theory of divisibility for polynomials in several unknowns that generalizes the theory of divisibility for polynomials in one unknown, which we studied in Chapters 5 and 10. However, since we do not intend to go into a detailed study of the ring of polynomials in several unknowns, we will confine ourselves to the problem of factoring a polynomial into irreducible factors.

* The appropriate concepts for the case of $n = 1$ were introduced in Sec. 47: there, an element $\alpha$, algebraically independent over the field $P$ in the sense of the foregoing definition, was called transcendental over $P$, otherwise it was algebraic over $P$.

First let us introduce the following concept: if all terms of a polynomial $f(x_1, x_2, \dots, x_n)$ have one and the same degree $s$, then it is called a homogeneous polynomial or, briefly, a form of degree $s$; we are acquainted with linear and quadratic forms, and we could consider cubic forms, all terms of which are of degree 3 in the unknowns taken together, etc. Any polynomial in $n$ unknowns can be uniquely represented as a sum of several forms in these unknowns, the latter having various degrees. To obtain the desired representation, all we need to do is combine all terms of the same degree. For example, a polynomial of degree four $f(x_1, x_2, x_3)$ may be the sum of a quartic form, a cubic form, the linear form $x_2 - 2x_3$ and the constant term (a form of degree zero) $-6$.

Let us now prove the following theorem. 

The degree of a product of two nonzero polynomials in $n$ unknowns is equal to the sum of the degrees of the polynomials.

First suppose that we have the forms $\varphi(x_1, x_2, \ldots, x_n)$ of degree $s$ and $\psi(x_1, x_2, \ldots, x_n)$ of degree $t$. The product of any term of the form $\varphi$ by any term of the form $\psi$ will obviously have the degree $s + t$, and so the product $\varphi\psi$ will be a form of degree $s + t$, since collecting like terms cannot make all the coefficients of this product vanish, due to the absence of divisors of zero in the ring $P[x_1, x_2, \ldots, x_n]$.

If we are now given arbitrary polynomials $f(x_1, x_2, \ldots, x_n)$ and $g(x_1, x_2, \ldots, x_n)$ of degrees $s$ and $t$, respectively, then, by representing each of them as a sum of forms of different degrees, we get

$$f(x_1, x_2, \ldots, x_n) = \varphi(x_1, x_2, \ldots, x_n) + \ldots,$$

$$g(x_1, x_2, \ldots, x_n) = \psi(x_1, x_2, \ldots, x_n) + \ldots,$$

where $\varphi$ and $\psi$ are, respectively, forms of degrees $s$ and $t$, and the dots stand for sums of forms of lower degrees. Then

$$fg = \varphi\psi + \ldots$$

By what has been proved, the form $\varphi\psi$ is of degree $s + t$, and since all terms replaced by dots are of lower degree, the degree of the product $fg$ will be equal to $s + t$. The theorem is proved.
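The argument can be tried out on a small case. The following is only an illustrative sketch: the dictionary representation (exponent tuple mapped to coefficient), the helper names `multiply`, `degree` and `forms`, and the two sample polynomials are assumptions made for this example and do not come from the text.

```python
from collections import defaultdict

def multiply(f, g):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    product = defaultdict(int)
    for ef, cf in f.items():
        for eg, cg in g.items():
            product[tuple(a + b for a, b in zip(ef, eg))] += cf * cg
    # Drop terms whose coefficients cancelled when like terms were collected.
    return {e: c for e, c in product.items() if c != 0}

def degree(f):
    """Degree with respect to all unknowns taken together."""
    return max(sum(e) for e in f)

def forms(f):
    """Split f into its homogeneous forms, keyed by degree."""
    parts = defaultdict(dict)
    for e, c in f.items():
        parts[sum(e)][e] = c
    return dict(parts)

# Hypothetical sample polynomials in two unknowns:
# f = x1^2*x2 - x2 + 3 (degree 3), g = x1*x2 + x1 - x2 (degree 2).
f = {(2, 1): 1, (0, 1): -1, (0, 0): 3}
g = {(1, 1): 1, (1, 0): 1, (0, 1): -1}
fg = multiply(f, g)

print(degree(f), degree(g), degree(fg))
# The form of highest degree in fg is the product of the leading forms of f and g.
print(forms(fg)[degree(fg)])
```

The form of highest degree in the product is obtained from the forms of highest degree of the factors alone, which is exactly what the proof uses.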

The polynomial $\varphi$ is called a divisor of the polynomial $f$, and $f$ is said to be divisible by $\varphi$, if in the ring $P[x_1, x_2, \ldots, x_n]$ there is a polynomial $\psi$ such that $f = \varphi\psi$. It is easy to see that the divisibility properties I-IX (Sec. 21) are preserved in this general case as well. A polynomial $f$ of degree $k$, $k \geq 1$, is called reducible over a field $P$ if it can be decomposed into a product of polynomials from the ring $P[x_1, x_2, \ldots, x_n]$ whose degrees are less than $k$. Otherwise it is an irreducible polynomial.

Any polynomial in the ring $P[x_1, x_2, \ldots, x_n]$ having a nonzero degree can be decomposed into a product of irreducible factors. This decomposition (factorization) is unique to within factors of degree zero.

This theorem generalizes the corresponding results of Sec. 48, which refer to polynomials in one unknown. The first assertion is proved by repeating exactly the reasoning of Sec. 48. The proof of the second assertion is much more difficult. Before attempting it, we note that from the second assertion of this theorem there follows a corollary: if the product of two polynomials $f$ and $g$ from the ring $P[x_1, x_2, \ldots, x_n]$ is divisible by an irreducible polynomial $p$, then at least one of these polynomials is divisible by $p$. This is so, for otherwise we would have, for the product $fg$, two decompositions into irreducible factors, one of which contains $p$ and the other does not.

Suppose the theorem has been proved for polynomials in $n$ unknowns and we wish to prove it for a polynomial in $n + 1$ unknowns $x, x_1, x_2, \ldots, x_n$. Write this polynomial as $\varphi(x)$. Its coefficients will consequently be polynomials in $x_1, x_2, \ldots, x_n$. For these coefficients the theorem has already been proved, that is to say, each of them can be uniquely decomposed into a product of irreducible factors. Let us call $\varphi(x)$ a primitive polynomial (more exactly, primitive over the ring $P[x_1, x_2, \ldots, x_n]$) if its coefficients do not contain a single common irreducible factor, that is to say, are all relatively prime, and let us prove the following lemma (Gauss' lemma).

The product of two primitive polynomials is itself primitive. 

Indeed, suppose we have the primitive polynomials

$$f(x) = a_0x^k + a_1x^{k-1} + \ldots + a_ix^{k-i} + \ldots + a_k,$$

$$g(x) = b_0x^l + b_1x^{l-1} + \ldots + b_jx^{l-j} + \ldots + b_l$$

with coefficients from the ring $P[x_1, x_2, \ldots, x_n]$ and let

$$f(x)g(x) = c_0x^{k+l} + c_1x^{k+l-1} + \ldots + c_{i+j}x^{k+l-(i+j)} + \ldots + c_{k+l}.$$

If this product is not primitive, then the coefficients $c_0, c_1, \ldots, c_{k+l}$ will have a common irreducible factor $p = p(x_1, x_2, \ldots, x_n)$. Since all the coefficients of the primitive polynomial $f(x)$ cannot be divisible by $p$, let the coefficient $a_i$ be the first that is not divisible by $p$; similarly, by $b_j$ denote the first coefficient of the polynomial $g(x)$ that is not divisible by $p$. Multiplying $f(x)$ and $g(x)$ termwise and collecting terms in $x^{k+l-(i+j)}$, we get

$$c_{i+j} = a_ib_j + a_{i-1}b_{j+1} + a_{i-2}b_{j+2} + \ldots + a_{i+1}b_{j-1} + a_{i+2}b_{j-2} + \ldots$$

The left member is divisible by the irreducible polynomial $p$. All terms of the right member (except the first) are also definitely divisible by $p$. Indeed, by the conditions imposed on the choice of $i$ and $j$, all coefficients $a_{i-1}, a_{i-2}, \ldots$ and also $b_{j-1}, b_{j-2}, \ldots$ are divisible by $p$. From this it follows that the product $a_ib_j$ is also divisible by $p$ and therefore, as noted above, at least one of the polynomials $a_i$, $b_j$ must be divisible by $p$, which, however, is not the case. This completes the proof of the lemma, under the assumption that the fundamental theorem for polynomials in $n$ unknowns holds true.

As we know, the ring $P[x_1, x_2, \ldots, x_n]$ is contained in the field of rational fractions $P(x_1, x_2, \ldots, x_n)$, which we will denote by $Q$:

$$Q = P(x_1, x_2, \ldots, x_n)$$

Let us consider the polynomial ring $Q[x]$. If the polynomial $\varphi(x)$ belongs to this ring, then each coefficient of it can be represented as a quotient of polynomials from the ring $P[x_1, x_2, \ldots, x_n]$. Taking out the common denominator of these quotients and then removing the common factors from the numerators, we can represent $\varphi(x)$ as

$$\varphi(x) = \frac{a}{b} f(x)$$

Here, $a$ and $b$ are polynomials of the ring $P[x_1, x_2, \ldots, x_n]$ and $f(x)$ is a polynomial in $x$ with coefficients from $P[x_1, x_2, \ldots, x_n]$; it is even a primitive polynomial, since its coefficients do not have common factors.

In this way, we associate with every polynomial $\varphi(x)$ of the ring $Q[x]$ a primitive polynomial $f(x)$. For the given polynomial $\varphi(x)$, the polynomial $f(x)$ is defined uniquely to within a nonzero factor in the field $P$. Indeed, let

$$\varphi(x) = \frac{a}{b} f(x) = \frac{c}{d} g(x)$$

where $g(x)$ is again a primitive polynomial. Then

$$ad\,f(x) = bc\,g(x)$$

Thus, $ad$ and $bc$ are obtained by taking out all common factors from the coefficients of one and the same polynomial over the ring $P[x_1, x_2, \ldots, x_n]$. Whence it follows, due to the validity in this ring (on the induction hypothesis) of the unique factorization theorem, that $ad$ and $bc$ can differ only by a factor of degree zero. Hence, the primitive polynomials $f(x)$ and $g(x)$ differ by the same factor.

The product of two polynomials from the ring $Q[x]$ is associated with the product of the primitive polynomials corresponding to them. Indeed, if

$$\varphi(x) = \frac{a}{b} f(x), \qquad \psi(x) = \frac{c}{d} g(x)$$

where $f(x)$ and $g(x)$ are primitive polynomials, then

$$\varphi(x)\psi(x) = \frac{ac}{bd} f(x)g(x)$$

But, as was proved above, the product $f(x)g(x)$ is a primitive polynomial.

Furthermore, note that if the polynomial $\varphi(x)$ from the ring $Q[x]$ is irreducible over the field $Q$, then the corresponding primitive polynomial $f(x)$, regarded as a polynomial in $x, x_1, x_2, \ldots, x_n$, is also irreducible, and conversely. Indeed, if the polynomial $f$ is reducible, $f = f_1f_2$, then both factors must contain the unknown $x$, since otherwise the polynomial $f$ would not be primitive, whence follows the decomposition of the polynomial $\varphi(x)$ over the field $Q$:

$$\varphi(x) = \frac{a}{b} f(x) = \left(\frac{a}{b} f_1\right) f_2$$

Conversely, if the polynomial $\varphi(x)$ is reducible over $Q$, $\varphi(x) = \varphi_1(x)\varphi_2(x)$, then the primitive polynomials $f_1(x)$ and $f_2(x)$, corresponding to the polynomials $\varphi_1(x)$ and $\varphi_2(x)$, will both contain $x$, but their product, as was proved above, is equal to $f(x)$ (to within a factor from the field $P$).

Now let us take a primitive polynomial $f$ and factor it into irreducible factors, $f = f_1f_2 \cdots f_k$. Not only must all these factors contain the unknown $x$, they will even be primitive polynomials, for otherwise the polynomial $f$ would not be primitive. This factorization of the primitive polynomial $f$ is unique to within factors from the field $P$. Indeed, by the preceding lemma, we can regard this factorization as a factorization of $f(x)$ into irreducible factors over the field $Q$, and we already know of the uniqueness of factorization of polynomials in one unknown over some field; this uniqueness occurs to within factors from $Q$. However, in our case, due to the primitivity of all factors $f_i$, it will be to within factors from $P$.

After these lemmas, proved by induction, the proof of our fundamental theorem does not present any difficulties. Indeed, any irreducible polynomial in the ring $P[x, x_1, x_2, \ldots, x_n]$ will either be an irreducible polynomial from the ring $P[x_1, x_2, \ldots, x_n]$ or an irreducible primitive polynomial. From this it follows that if we have some factorization of the polynomial $\varphi(x, x_1, x_2, \ldots, x_n)$ into irreducible factors, then, by combining factors, we can represent $\varphi$ as

$$\varphi(x, x_1, x_2, \ldots, x_n) = a(x_1, x_2, \ldots, x_n)\, f(x, x_1, x_2, \ldots, x_n)$$

where $a$ is independent of $x$ and $f$ is a primitive polynomial. However, we know that this factorization of $\varphi$ is unique to within factors from $P$. On the other hand, since for the polynomial $a$ in $n$ unknowns the uniqueness of factorization into irreducible factors holds by the induction hypothesis, and, for the primitive polynomial $f$, was proved in the preceding lemma, the proof of our theorem for the case of $n + 1$ unknowns is also complete.

An interesting corollary stems from the lemmas proved above: if a polynomial $\varphi(x)$ with coefficients in $P[x_1, x_2, \ldots, x_n]$ is reducible over the field $Q = P(x_1, x_2, \ldots, x_n)$, then it can be factored into factors dependent on $x$ and having, as coefficients, polynomials from the ring $P[x_1, x_2, \ldots, x_n]$. Indeed, if to the polynomial $\varphi(x)$ there corresponds a primitive polynomial $f(x)$, that is, $\varphi(x) = af(x)$, then, as we know, the factorability of $f(x)$ follows from the factorability of $\varphi(x)$. But this latter fact leads to the factorization of $\varphi(x)$ over the ring $P[x_1, x_2, \ldots, x_n]$.

In contrast to the case of polynomials in one unknown, which, as we know from Sec. 49, can be factored into linear factors over an appropriately chosen extension of the base field under consideration, there exist over any field $P$ absolutely irreducible polynomials of arbitrary degree in several (two or more) unknowns, that is to say, polynomials that remain irreducible under any extension of the field.

Such, for instance, is the polynomial

$$f(x, y) = \varphi(x) + y$$

where $\varphi(x)$ is an arbitrary polynomial in one unknown over the field $P$. Indeed, if there were a factorization

$$f(x, y) = g(x, y)\,h(x, y)$$

in some extension $\bar{P}$ of the field $P$, then, by writing $g$ and $h$ in terms of powers of $y$, we would have, say,

$$g(x, y) = a_0(x)y + a_1(x), \qquad h(x, y) = b_0(x)$$

that is, $h$ is not dependent on $y$; and then, because $a_0(x)b_0(x) = 1$, we would have that $b_0(x)$ has degree 0, i.e., $h$ is not dependent on $x$ either.

Alphabetical order of the terms of a polynomial. For polynomials in one unknown, we have two natural ways of arranging the terms: in descending and in ascending powers of the unknown. This is not possible for polynomials in several unknowns. If we have a polynomial of degree five in three unknowns,


/ (Xi, X., x.i) = xix=x^ -i- xjxa -J- xlx] -- x^x^x"- 
it may also be written as 


Xixtxi -r xjxj 


/ (xj, X2, X3) = xjx3 -f- xjx^xj - 

and there is no reason to prefer one notation to the other. There is, however, a very definite way of ordering the terms of a polynomial in several unknowns; it depends, incidentally, on the manner in which the unknowns are numbered. For polynomials in one unknown it reduces to ordering the terms in descending powers of the unknown. It is known as the alphabetical method.

Suppose we have a polynomial $f(x_1, x_2, \ldots, x_n)$ in the ring $P[x_1, x_2, \ldots, x_n]$ and two distinct terms of the polynomial,

$$x_1^{k_1}x_2^{k_2} \cdots x_n^{k_n}, \tag{1}$$

$$x_1^{l_1}x_2^{l_2} \cdots x_n^{l_n}, \tag{2}$$

whose coefficients are certain nonzero elements of $P$. Since the terms (1) and (2) are distinct, at least one of the differences of the exponents on the unknowns

$$k_i - l_i, \qquad i = 1, 2, \ldots, n$$

is nonzero. Term (1) will be considered higher than term (2) [and term (2) lower than term (1)] if the first of these differences that is nonzero is positive, that is, if there is an $i$, $1 \leq i \leq n$, such that

$$k_1 = l_1,\ k_2 = l_2,\ \ldots,\ k_{i-1} = l_{i-1}, \text{ but } k_i > l_i$$

In other words, term (1) will be higher than term (2) if the exponent on $x_1$ in (1) is greater than in (2), or if these exponents are equal but the exponent on $x_2$ in (1) is greater than in (2), and so forth. It will readily be seen that from the fact that term (1) is higher than term (2) it does not follow that the degree of the former (all unknowns taken together) is greater than that of the latter: of the terms


the first is higher though it is of lower degree. 

It is obvious that of any two distinct terms of the polynomial $f(x_1, x_2, \ldots, x_n)$, one will be higher than the other. It is also easy to verify that if term (1) is higher than term (2), and (2), in turn, is higher than the term

$$x_1^{m_1}x_2^{m_2} \cdots x_n^{m_n} \tag{3}$$

that is, there exists a $j$, $1 \leq j \leq n$, such that

$$l_1 = m_1,\ l_2 = m_2,\ \ldots,\ l_{j-1} = m_{j-1}, \text{ but } l_j > m_j$$

then, irrespective of whether $i$ is greater than, equal to, or less than $j$, term (1) will be higher than term (3). Thus, placing first that term which is higher, we get a definite ordering of the terms of the polynomial $f(x_1, x_2, \ldots, x_n)$, which is called alphabetical.
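The rule for comparing two terms can be written out directly. The sketch below is an illustration only: terms are represented by their exponent tuples $(k_1, \ldots, k_n)$, and the function names and sample exponents are assumptions of this example. In Python the built-in comparison of tuples happens to follow the same rule, which the last lines check.

```python
def is_higher(k, l):
    """True if the term with exponents k is higher than the term with exponents l."""
    for ki, li in zip(k, l):
        if ki != li:
            # The first nonzero difference k_i - l_i decides the comparison.
            return ki > li
    return False  # identical exponent systems: neither term is higher

def alphabetical(exponent_systems):
    """Arrange exponent tuples so that higher terms come first."""
    return sorted(exponent_systems, reverse=True)

# A term may be higher although its degree is lower (hypothetical exponents):
print(is_higher((3, 0, 0), (1, 5, 2)))   # degree 3 against degree 8

terms = [(1, 1, 0), (2, 0, 0), (0, 3, 1)]
print(alphabetical(terms))

# Python's tuple comparison agrees with is_higher on these samples.
print(all(is_higher(k, l) == (k > l) for k in terms for l in terms))
```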

Thus, the polynomial 


/ {xu 


2» 




X^) = xj -f 3 x;x->3 — x-xlxl + 5x,X3x;- + Ix.+xlx^ — 4 


is arranged in alphabetical order. 

In the alphabetical notation of the polynomial $f(x_1, x_2, \ldots, x_n)$ one of its terms will occupy first place, that is, will be higher than any of the others. This term is called the highest term of the polynomial; in the example given above, the first term is the highest term. We will now prove a lemma concerning highest terms; it will be used in the proof of the fundamental theorem of the next section.

The highest term of a product of two polynomials in $n$ unknowns is equal to the product of the highest terms of the factors.

Indeed, suppose we are multiplying the polynomials $f(x_1, x_2, \ldots, x_n)$ and $g(x_1, x_2, \ldots, x_n)$. If

$$a x_1^{k_1}x_2^{k_2} \cdots x_n^{k_n} \tag{4}$$

is the highest term of the polynomial $f(x_1, x_2, \ldots, x_n)$, and

$$a' x_1^{s_1}x_2^{s_2} \cdots x_n^{s_n} \tag{5}$$

is any other term of this polynomial, then there is an $i$, $1 \leq i \leq n$, such that

$$k_1 = s_1,\ \ldots,\ k_{i-1} = s_{i-1},\ k_i > s_i$$

If, on the other hand,

$$b x_1^{l_1}x_2^{l_2} \cdots x_n^{l_n} \tag{6}$$

$$b' x_1^{t_1}x_2^{t_2} \cdots x_n^{t_n} \tag{7}$$

are the highest term and any other term of the polynomial $g(x_1, x_2, \ldots, x_n)$, then there is a $j$, $1 \leq j \leq n$, such that

$$l_1 = t_1,\ \ldots,\ l_{j-1} = t_{j-1},\ l_j > t_j$$

Multiplying the terms (4) and (6) and also the terms (5) and (7), we get

$$ab\, x_1^{k_1+l_1}x_2^{k_2+l_2} \cdots x_n^{k_n+l_n} \tag{8}$$

$$a'b'\, x_1^{s_1+t_1}x_2^{s_2+t_2} \cdots x_n^{s_n+t_n} \tag{9}$$

It is easy to see, however, that term (8) is higher than term (9); if, say, $i \leq j$, then

$$k_1 + l_1 = s_1 + t_1,\ \ldots,\ k_{i-1} + l_{i-1} = s_{i-1} + t_{i-1}, \text{ but } k_i + l_i > s_i + t_i$$

since $k_i > s_i$, $l_i \geq t_i$. In the same way, we see that term (8) is higher than the product of the terms (4) and (7), and also higher than the product of the terms (5) and (6). Thus, term (8), the product of the highest terms of the polynomials $f$ and $g$, will be higher than all other terms obtained by termwise multiplication of the polynomials $f$ and $g$, and so this term does not vanish when we collect terms; that is to say, it remains the highest term in the product $fg$.
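The lemma can be checked on a concrete product. This is a minimal sketch under assumptions of our own: polynomials are dictionaries from exponent tuples to coefficients, the helper names are hypothetical, and the two sample polynomials in three unknowns are invented for the illustration.

```python
from collections import defaultdict

def multiply(f, g):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    product = defaultdict(int)
    for ef, cf in f.items():
        for eg, cg in g.items():
            product[tuple(a + b for a, b in zip(ef, eg))] += cf * cg
    return {e: c for e, c in product.items() if c != 0}

def highest_term(f):
    """Return (exponents, coefficient) of the alphabetically highest term."""
    exponents = max(f)  # tuples compare by the first differing exponent
    return exponents, f[exponents]

# Hypothetical samples: f = 3*x1^2*x3 - x1*x2^4 + 5*x3^2,
#                       g = 2*x1*x2 + 7*x1*x3^3 - 4*x2^2*x3^2.
f = {(2, 0, 1): 3, (1, 4, 0): -1, (0, 0, 2): 5}
g = {(1, 1, 0): 2, (1, 0, 3): 7, (0, 2, 2): -4}

(kf, af), (kg, ag) = highest_term(f), highest_term(g)
expected = (tuple(a + b for a, b in zip(kf, kg)), af * ag)
print(highest_term(multiply(f, g)) == expected)
```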

52. Symmetric Polynomials

Conspicuous among polynomials in several unknowns are those that remain unchanged no matter what rearrangements of the unknowns occur. Thus, all unknowns appear in these polynomials in symmetric fashion, whence the name symmetric polynomials (or symmetric functions). Among the simplest examples are the sum of all unknowns $x_1 + x_2 + \ldots + x_n$, the sum of the squares of the unknowns $x_1^2 + x_2^2 + \ldots + x_n^2$, the product of the unknowns $x_1x_2 \cdots x_n$, and so on. Since any permutation on $n$ symbols can be represented in the form of a product of transpositions (see Sec. 3), it is sufficient, when proving the symmetry of a polynomial, to verify that it remains unchanged under any transposition of two unknowns.

We shall now consider symmetric polynomials in $n$ unknowns with coefficients from some field $P$. It is easy to see that the sum, difference and product of two symmetric polynomials are symmetric, that is to say, symmetric polynomials form a subring in the ring $P[x_1, x_2, \ldots, x_n]$ of all polynomials in $n$ unknowns over the field $P$; this is called the ring of symmetric polynomials in $n$ unknowns over the field $P$. It includes all elements of $P$ (that is, all polynomials of degree zero and also zero), since they definitely do not change under any rearrangement of the unknowns. Any other symmetric polynomial invariably contains all $n$ unknowns and even has one and the same degree with respect to them: if a symmetric polynomial $f(x_1, x_2, \ldots, x_n)$ has a term in which the unknown $x_i$ appears with an exponent $k$, then it also has a term obtained from the first one by a transposition of the unknowns $x_i$ and $x_j$, that is, one containing the unknown $x_j$ to the same power $k$.

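The following $n$ symmetric polynomials in $n$ unknowns are called elementary symmetric polynomials:

$$\begin{aligned}
\sigma_1 &= x_1 + x_2 + \ldots + x_n,\\
\sigma_2 &= x_1x_2 + x_1x_3 + \ldots + x_{n-1}x_n,\\
\sigma_3 &= x_1x_2x_3 + x_1x_2x_4 + \ldots + x_{n-2}x_{n-1}x_n,\\
&\;\;\vdots\\
\sigma_{n-1} &= x_1x_2 \cdots x_{n-1} + x_1x_2 \cdots x_{n-2}x_n + \ldots + x_2x_3 \cdots x_n,\\
\sigma_n &= x_1x_2 \cdots x_n
\end{aligned} \tag{1}$$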
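For given numerical values of the unknowns, $\sigma_k$ is simply the sum of all products of $k$ distinct values. The sketch below (function names and the sample roots 1, 2, 3 are assumptions of this illustration) evaluates the elementary symmetric polynomials and compares them, up to sign, with the coefficients of the monic polynomial having those values as roots, as the Vieta formulas of Sec. 24 lead one to expect.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values, k):
    """Value of sigma_k: the sum of all products of k distinct entries."""
    return sum(prod(chosen) for chosen in combinations(values, k))

def monic_from_roots(roots):
    """Coefficients (highest power first) of the product of (x - r) over all roots r."""
    coeffs = [1]
    for r in roots:
        # Multiplying by (x - r): new[i] = old[i] - r * old[i - 1].
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

roots = [1, 2, 3]
sigmas = [elementary_symmetric(roots, k) for k in range(1, len(roots) + 1)]
print(sigmas)
print(monic_from_roots(roots))
# The coefficient of x^(n-k) equals (-1)^k * sigma_k.
print([(-1) ** k * s for k, s in enumerate(sigmas, start=1)])
```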



These polynomials, whose symmetry is obvious, play a very great role in the theory of symmetric polynomials. They are suggested by the Vieta formulas (see Sec. 24) and so we can say that the coefficients of a polynomial in one unknown, the leading coefficient of which is unity, will, to within sign, be elementary symmetric polynomials with respect to its roots. This relationship between elementary symmetric polynomials and the Vieta formulas will be very essential for those applications of symmetric polynomials to the theory of polynomials in one unknown which justify their study.

Since symmetric polynomials in $n$ unknowns $x_1, x_2, \ldots, x_n$ over the field $P$ constitute a ring, the following assertions are obvious: we have a symmetric polynomial in the case of any positive integer power of any one of the elementary symmetric polynomials, also in the case of a product of such powers (taken with any coefficient of $P$), and, finally, in the case of any sum of these products. In other words, any polynomial in the elementary symmetric polynomials $\sigma_1, \sigma_2, \ldots, \sigma_n$ with coefficients from $P$, which polynomial is regarded as a polynomial in the unknowns $x_1, x_2, \ldots, x_n$, will be symmetric. For example, set $n = 3$ and take the polynomial $\sigma_1\sigma_2 + 2\sigma_3$. Replacing $\sigma_1$, $\sigma_2$ and $\sigma_3$ by their expressions, we get

$$\sigma_1\sigma_2 + 2\sigma_3 = x_1^2x_2 + x_1^2x_3 + x_1x_2^2 + x_2^2x_3 + x_1x_3^2 + x_2x_3^2 + 5x_1x_2x_3$$

What we have on the right is obviously a symmetric polynomial in $x_1, x_2, x_3$.

An inversion of this result is the following fundamental theorem on symmetric polynomials.

Any symmetric polynomial in the unknowns $x_1, x_2, \ldots, x_n$ over the field $P$ is a polynomial in the elementary symmetric polynomials $\sigma_1, \sigma_2, \ldots, \sigma_n$ with coefficients belonging to $P$.

Indeed, suppose we have the symmetric polynomial

$$f(x_1, x_2, \ldots, x_n)$$

and, in the alphabetical notation, let the highest term be

$$a_0 x_1^{k_1}x_2^{k_2} \cdots x_n^{k_n} \tag{2}$$

The exponents on the unknowns in this term must satisfy the inequalities

$$k_1 \geq k_2 \geq \ldots \geq k_n \tag{3}$$

Indeed, suppose, for some $i$, we have $k_i < k_{i+1}$. However, since the polynomial $f(x_1, x_2, \ldots, x_n)$ is symmetric, it must contain the term

$$a_0 x_1^{k_1}x_2^{k_2} \cdots x_i^{k_{i+1}}x_{i+1}^{k_i} \cdots x_n^{k_n} \tag{4}$$

which is obtained from term (2) by a transposition of the unknowns $x_i$ and $x_{i+1}$. This is a contradiction, since term (4) is higher than term (2) alphabetically: the exponents on $x_1, x_2, \ldots, x_{i-1}$ coincide in both terms, but the exponent on $x_i$ in term (4) is greater than in term (2).

Let us now take the following product of elementary symmetric polynomials [all exponents will be nonnegative because of inequalities (3)]:

$$\varphi_1 = a_0 \sigma_1^{k_1-k_2}\sigma_2^{k_2-k_3} \cdots \sigma_{n-1}^{k_{n-1}-k_n}\sigma_n^{k_n} \tag{5}$$

This is a symmetric polynomial in the unknowns $x_1, x_2, \ldots, x_n$ and its highest term is equal to term (2). Indeed, the highest terms of the polynomials $\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n$ are equal, respectively, to $x_1$, $x_1x_2$, $x_1x_2x_3$, $\ldots$, $x_1x_2 \cdots x_n$, and since it was proved at the end of the preceding section that the highest term of a product is equal to the product of the highest terms of the factors, it follows that the highest term of the polynomial $\varphi_1$ is

$$a_0 x_1^{k_1-k_2}(x_1x_2)^{k_2-k_3} \cdots (x_1x_2 \cdots x_{n-1})^{k_{n-1}-k_n}(x_1x_2 \cdots x_n)^{k_n} = a_0 x_1^{k_1}x_2^{k_2} \cdots x_n^{k_n}$$

From this it follows that when we subtract $\varphi_1$ from $f$, the highest terms of these polynomials cancel out, that is, the highest term of the symmetric polynomial $f - \varphi_1 = f_1$ will be lower than the term (2), which is the highest one in $f$. Repeating this same procedure for the polynomial $f_1$, whose coefficients obviously belong to the field $P$, we get the equality

$$f_1 = \varphi_2 + f_2$$

where $\varphi_2$ is a product of powers of elementary symmetric polynomials with a coefficient in $P$, and $f_2$ is a symmetric polynomial whose highest term is lower than the highest term in $f_1$, whence the equality

$$f = \varphi_1 + \varphi_2 + f_2$$

Continuing this process, we get $f_s = 0$ for some $s$ and therefore arrive at an expression of $f$ in the form of a polynomial in $\sigma_1, \sigma_2, \ldots, \sigma_n$ with coefficients in $P$:

$$f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{s} \varphi_i = \varphi(\sigma_1, \sigma_2, \ldots, \sigma_n)$$

Indeed, if this process were endless,* we would obtain an infinite sequence of symmetric polynomials:

$$f_1, f_2, \ldots, f_s, \ldots \tag{6}$$

and the highest term of each would be lower than the highest terms of the preceding polynomials, and all the more so lower than (2). However, if

$$b x_1^{l_1}x_2^{l_2} \cdots x_n^{l_n} \tag{7}$$

is the highest term of the polynomial $f_s$, then from the symmetry of this polynomial there follow the inequalities

$$l_1 \geq l_2 \geq \ldots \geq l_n \tag{8}$$

which are similar to the inequalities (3). On the other hand, since term (2) is higher than term (7), it follows that

$$k_1 \geq l_1 \tag{9}$$

* One must bear in mind that, generally speaking, the polynomial $\varphi_s$ also contains terms not found in the polynomial $f_{s-1}$, and therefore the transition from $f_{s-1}$ to $f_s = f_{s-1} - \varphi_s$ is connected not only with eliminating certain terms from $f_{s-1}$ but also with the appearance of new terms. Here, $s = 1, 2, \ldots$.

It is readily seen, however, that the systems of nonnegative integers $l_1, l_2, \ldots, l_n$ which satisfy the inequalities (8) and (9) may be chosen in only a finite number of ways. Indeed, even if we give up the requirement (8) and only assume that all $l_i$, $i = 1, 2, \ldots, n$, do not exceed $k_1$, then the choice of numbers $l_i$ will be possible in only $(k_1 + 1)^n$ ways. Whence it follows that the sequence of polynomials (6) with strictly descending highest terms cannot be infinite. This completes the proof of the theorem.
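The proof is constructive, and its steps can be followed mechanically. The sketch below is one possible rendering, not a definitive implementation: polynomials are dictionaries from exponent tuples to coefficients, the names `multiply`, `sigma` and `in_elementary` are hypothetical, and integer coefficients are assumed for simplicity. At each step it takes the highest term, forms the product (5), and subtracts it.

```python
from collections import defaultdict
from itertools import combinations

def multiply(f, g):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    product = defaultdict(int)
    for ef, cf in f.items():
        for eg, cg in g.items():
            product[tuple(a + b for a, b in zip(ef, eg))] += cf * cg
    return {e: c for e, c in product.items() if c != 0}

def sigma(k, n):
    """Elementary symmetric polynomial sigma_k in n unknowns."""
    return {tuple(1 if i in chosen else 0 for i in range(n)): 1
            for chosen in combinations(range(n), k)}

def in_elementary(f, n):
    """Express a symmetric polynomial f in n unknowns through sigma_1..sigma_n.

    Returns {(d_1, ..., d_n): c}, read as the sum of c * sigma_1^d_1 ... sigma_n^d_n.
    """
    f = {e: c for e, c in f.items() if c != 0}
    expression = {}
    while f:
        k = max(f)          # exponents of the highest term, as in (2)
        a0 = f[k]
        # Exponents k1-k2, ..., k_{n-1}-k_n, k_n of formula (5).
        d = tuple(k[i] - k[i + 1] for i in range(n - 1)) + (k[n - 1],)
        if min(d) < 0:
            raise ValueError("the polynomial is not symmetric")
        expression[d] = a0
        phi = {(0,) * n: a0}
        for index, power in enumerate(d, start=1):
            for _ in range(power):
                phi = multiply(phi, sigma(index, n))
        # Pass from f to f - phi; the highest term cancels.
        for e, c in phi.items():
            f[e] = f.get(e, 0) - c
        f = {e: c for e, c in f.items() if c != 0}
    return expression

# x1^2 + x2^2 + x3^2 in three unknowns
sum_of_squares = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}
print(in_elementary(sum_of_squares, 3))
```

For the sum of squares the result reads $\sigma_1^2 - 2\sigma_2$. The loop terminates for the reason given in the proof: the highest terms strictly descend, and for a polynomial that is not symmetric a negative exponent in (5) must eventually appear.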

The above-indicated relationship between elementary symmetric polynomials and the Vieta formulas permits deriving the following important corollary from the fundamental theorem on symmetric polynomials.

Let $f(x)$ be a polynomial in one unknown over the field $P$ having the leading coefficient unity. Then any symmetric polynomial (with coefficients from $P$) in the roots of the polynomial $f(x)$, which roots belong to some splitting field of the polynomial $f(x)$ over $P$, will be a polynomial (with coefficients from $P$) in the coefficients of the polynomial $f(x)$ and therefore will be an element of $P$.

The foregoing proof of the fundamental theorem also provides us with a practical method for finding the expressions of symmetric polynomials in terms of elementary polynomials. Let us first introduce the following notation: if

$$a x_1^{k_1}x_2^{k_2} \cdots x_n^{k_n} \tag{10}$$

is some product of powers of the unknowns $x_1, x_2, \ldots, x_n$ (some of the exponents may be equal to zero), then

$$S(a x_1^{k_1}x_2^{k_2} \cdots x_n^{k_n}) \tag{11}$$

will denote the sum of all terms obtained from (10) by all possible rearrangements of the unknowns. It is obvious that this will be a symmetric polynomial and homogeneous too, and that any symmetric polynomial in $n$ unknowns containing the term (10) will also contain all the other terms of the polynomial (11). For example, $S(x_1) = \sigma_1$, $S(x_1x_2) = \sigma_2$, $S(x_1^2)$ is the sum of the squares of all the unknowns, etc.

Example. Express the symmetric polynomial $f = S(x_1^2x_2)$ in $n$ unknowns in terms of the elementary symmetric polynomials.

Here, the highest term is $x_1^2x_2$ and therefore $\varphi_1 = \sigma_1^{2-1}\sigma_2 = \sigma_1\sigma_2$, that is,

$$\varphi_1 = (x_1 + x_2 + \ldots + x_n)(x_1x_2 + x_1x_3 + \ldots + x_{n-1}x_n) = S(x_1^2x_2) + 3S(x_1x_2x_3)$$

whence

$$f_1 = f - \varphi_1 = -3S(x_1x_2x_3) = -3\sigma_3$$

Therefore, $f = \varphi_1 + f_1 = \sigma_1\sigma_2 - 3\sigma_3$.

In more involved cases, it is advisable first to determine which terms can enter into the expression of the given polynomial via elementary polynomials, and then to find the coefficients of these terms by the method of undetermined coefficients.

Example 1. Find an expression for the symmetric polynomial $f = S(x_1^2x_2^2)$.

We know (see the proof of the fundamental theorem) that the terms of the desired polynomial $\varphi(\sigma_1, \sigma_2, \ldots, \sigma_n)$ are determined via the highest terms of the symmetric polynomials $f_1, f_2, \ldots$, these highest terms being lower than the highest term of the given polynomial $f$, that is, lower than $x_1^2x_2^2$. We find all the products $x_1^{l_1}x_2^{l_2} \cdots x_n^{l_n}$ that satisfy the following conditions: (1) they are lower than the term $x_1^2x_2^2$, (2) they can serve as the highest terms of symmetric polynomials, i.e., they satisfy the inequalities $l_1 \geq l_2 \geq \ldots \geq l_n$, (3) with respect to all unknowns taken together they have the degree 4 (since, as we know, all the polynomials $f_1, f_2, \ldots$ have the same degree as the homogeneous polynomial $f$). Writing out only appropriate combinations of exponents and indicating, alongside, those products of powers of $\sigma$ which are determined by them, we get the following table:

    2 2 0 0 0 ...    $\sigma_1^{2-2}\sigma_2^{2-0} = \sigma_2^2$,
    2 1 1 0 0 ...    $\sigma_1^{2-1}\sigma_2^{1-1}\sigma_3^{1-0} = \sigma_1\sigma_3$,
    1 1 1 1 0 ...    $\sigma_1^{1-1}\sigma_2^{1-1}\sigma_3^{1-1}\sigma_4^{1-0} = \sigma_4$

Thus, the polynomial $f$ has the form

$$f = \sigma_2^2 + A\sigma_1\sigma_3 + B\sigma_4$$

We set the coefficient of $\sigma_2^2$ equal to unity, since this term is determined by the highest term of the polynomial $f$ and, as we know from the proof of the fundamental theorem, has the same coefficient. The coefficients $A$ and $B$ are found as follows.

Set $x_1 = x_2 = x_3 = 1$, $x_4 = \ldots = x_n = 0$. It is easy to see that for these values of the unknowns the polynomial $f$ has the value 3, and the polynomials $\sigma_1$, $\sigma_2$, $\sigma_3$ and $\sigma_4$, the values 3, 3, 1, and 0, respectively. Therefore,

$$3 = 9 + A \cdot 3 \cdot 1 + B \cdot 0$$

whence $A = -2$. Now put $x_1 = x_2 = x_3 = x_4 = 1$, $x_5 = \ldots = x_n = 0$. The values of the polynomials $f$, $\sigma_1$, $\sigma_2$, $\sigma_3$ and $\sigma_4$ will be 6, 4, 6, 4, 1, respectively. Therefore,

$$6 = 36 - 2 \cdot 4 \cdot 4 + B \cdot 1$$

whence $B = 2$. Thus, for $f$ the desired expression is

$$f = \sigma_2^2 - 2\sigma_1\sigma_3 + 2\sigma_4$$
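The result of Example 1 can be spot-checked numerically. The following sketch is only a check at particular values, not a proof; the helper names `sigma` and `S` and the sample values are assumptions of this illustration. `S` sums the distinct terms obtained from a product of powers by rearranging the unknowns, in the sense of notation (11) with coefficient 1.

```python
from itertools import combinations, permutations
from math import prod

def sigma(values, k):
    """Value of the elementary symmetric polynomial sigma_k."""
    return sum(prod(chosen) for chosen in combinations(values, k))

def S(exponents, values):
    """Value of S(x1^e1 x2^e2 ...): each distinct rearranged term is counted once."""
    padded = tuple(exponents) + (0,) * (len(values) - len(exponents))
    return sum(prod(v ** e for v, e in zip(values, arrangement))
               for arrangement in set(permutations(padded)))

def example_1(values):
    """Right-hand side sigma_2^2 - 2*sigma_1*sigma_3 + 2*sigma_4 of Example 1."""
    s1, s2, s3, s4 = (sigma(values, k) for k in (1, 2, 3, 4))
    return s2 ** 2 - 2 * s1 * s3 + 2 * s4

# The two substitutions used in the text (with n = 5), then arbitrary integers.
for values in ([1, 1, 1, 0, 0], [1, 1, 1, 1, 0], [2, -1, 3, 5, 7]):
    print(S((2, 2), values), example_1(values))
```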

Example 2. Find the sum of the cubes of the roots of the polynomial

$$f(x) = x^4 + x^3 + 2x^2 + x + 1$$

To solve this problem, let us find the expression for the symmetric polynomial $S(x_1^3)$ in terms of the elementary symmetric polynomials. Applying the same method as in the preceding example, we get the table

    3 0 0 0 ...    $\sigma_1^3$,
    2 1 0 0 ...    $\sigma_1\sigma_2$,
    1 1 1 0 ...    $\sigma_3$

and therefore

$$S(x_1^3) = \sigma_1^3 + A\sigma_1\sigma_2 + B\sigma_3$$

First assuming $x_1 = x_2 = 1$, $x_3 = \ldots = x_n = 0$, and then $x_1 = x_2 = x_3 = 1$, $x_4 = \ldots = x_n = 0$, we get $A = -3$, $B = 3$, that is,

$$S(x_1^3) = \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3 \tag{12}$$

To find the sum of the cubes of the roots of the given polynomial $f(x)$, it is necessary (because of the Vieta formulas) to replace, in the above-found expression, $\sigma_1$ by the coefficient of $x^3$ with sign reversed, that is, by $-1$, then to replace $\sigma_2$ by the coefficient of $x^2$, that is, by 2, and, finally, to replace $\sigma_3$ by the coefficient of $x$ with sign reversed, i.e., by $-1$. Thus, the sum we are interested in (the sum of the cubes of the roots) is equal to

$$(-1)^3 - 3 \cdot (-1) \cdot 2 + 3 \cdot (-1) = 2$$

The reader can verify this result if he takes into account that $f(x)$ has as roots the numbers $i$, $-i$, $-\frac{1}{2} + i\frac{\sqrt{3}}{2}$, $-\frac{1}{2} - i\frac{\sqrt{3}}{2}$. It is obvious that the formula (12) does not depend on the given polynomial $f(x)$ and enables us to find the sum of the cubes of the roots of any polynomial.
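Formula (12) together with the Vieta formulas gives the sum of the cubes of the roots directly from the coefficients. The sketch below assumes a monic polynomial given by its coefficient list, highest power first; the function name `sum_of_cubes` is hypothetical. It also repeats the verification suggested to the reader, using floating-point values of the four roots.

```python
def sum_of_cubes(coeffs):
    """Sum of the cubes of the roots of a monic polynomial.

    coeffs = [1, a1, a2, ..., an] for x^n + a1*x^(n-1) + ... + an.
    By the Vieta formulas sigma_1 = -a1, sigma_2 = a2, sigma_3 = -a3.
    """
    if coeffs[0] != 1:
        raise ValueError("the leading coefficient must be unity")
    a = list(coeffs[1:]) + [0, 0, 0]   # missing coefficients count as zero
    s1, s2, s3 = -a[0], a[1], -a[2]
    return s1 ** 3 - 3 * s1 * s2 + 3 * s3   # formula (12)

# f(x) = x^4 + x^3 + 2x^2 + x + 1
print(sum_of_cubes([1, 1, 2, 1, 1]))

# Direct check with the roots i, -i, -1/2 + i*sqrt(3)/2, -1/2 - i*sqrt(3)/2.
roots = [1j, -1j, complex(-0.5, 3 ** 0.5 / 2), complex(-0.5, -(3 ** 0.5) / 2)]
print(abs(sum(r ** 3 for r in roots) - sum_of_cubes([1, 1, 2, 1, 1])) < 1e-9)
```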


The method, obtained in the proof of the fundamental theorem, for expressing a symmetric polynomial $f$ in terms of the elementary polynomials leads to a very definite polynomial in $\sigma_1, \sigma_2, \ldots, \sigma_n$. It turns out that there is no way of obtaining a different expression for $f$ in terms of $\sigma_1, \sigma_2, \ldots, \sigma_n$. This is indicated by the following uniqueness theorem.

Every symmetric polynomial has only a unique expression in the form of a polynomial in the elementary symmetric polynomials.

Here is the proof. If a symmetric polynomial $f(x_1, x_2, \ldots, x_n)$ over a field $P$ had two distinct expressions in terms of $\sigma_1, \sigma_2, \ldots, \sigma_n$,

$$f(x_1, x_2, \ldots, x_n) = \varphi(\sigma_1, \sigma_2, \ldots, \sigma_n) = \psi(\sigma_1, \sigma_2, \ldots, \sigma_n)$$

then the difference

$$\chi(\sigma_1, \sigma_2, \ldots, \sigma_n) = \varphi(\sigma_1, \sigma_2, \ldots, \sigma_n) - \psi(\sigma_1, \sigma_2, \ldots, \sigma_n)$$

would be a nonzero polynomial in $\sigma_1, \sigma_2, \ldots, \sigma_n$, that is, not all its coefficients would be zero, whereas replacing $\sigma_1, \sigma_2, \ldots, \sigma_n$ in this polynomial by their expressions in terms of $x_1, x_2, \ldots, x_n$ would lead to the zero of the ring $P[x_1, x_2, \ldots, x_n]$. It therefore remains to prove that if a polynomial $\chi(\sigma_1, \sigma_2, \ldots, \sigma_n)$ is different from zero, that is, has at least one nonzero coefficient, then the polynomial $g(x_1, x_2, \ldots, x_n)$ obtained from $\chi$ by replacing $\sigma_1, \sigma_2, \ldots, \sigma_n$ by their expressions in terms of $x_1, x_2, \ldots, x_n$,

$$\chi(\sigma_1, \sigma_2, \ldots, \sigma_n) = g(x_1, x_2, \ldots, x_n) \tag{13}$$

is also nonzero.

If $a\sigma_1^{k_1}\sigma_2^{k_2} \cdots \sigma_n^{k_n}$ is one of the terms of the polynomial $\chi$, $a \neq 0$, then after replacing all $\sigma$ by their expressions (1), we get a polynomial in $x_1, x_2, \ldots, x_n$ whose highest term (in the sense of alphabetical ordering) is, as we already know from the proof of the fundamental theorem, the term

$$a x_1^{k_1}(x_1x_2)^{k_2} \cdots (x_1x_2 \cdots x_n)^{k_n} = a x_1^{l_1}x_2^{l_2} \cdots x_n^{l_n}$$

where

$$l_1 = k_1 + k_2 + \ldots + k_n,\quad l_2 = k_2 + \ldots + k_n,\quad \ldots,\quad l_n = k_n$$

Whence

$$k_n = l_n, \qquad k_i = l_i - l_{i+1}, \quad i = 1, 2, \ldots, n - 1$$

That is to say, using the exponents $l_1, l_2, \ldots, l_n$ we can restore the exponents $k_1, k_2, \ldots, k_n$ of the initial term of the polynomial $\chi$. Thus, distinct terms of the polynomial $\chi$, which are regarded as polynomials in $x_1, x_2, \ldots, x_n$, have distinct highest terms.
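The passage between the two systems of exponents is a one-to-one correspondence, which a short sketch makes explicit. The function names and the sample exponents are assumptions of this illustration.

```python
def highest_exponents(k):
    """Exponents l_1..l_n of the highest term of sigma_1^k_1 ... sigma_n^k_n.

    l_i = k_i + k_(i+1) + ... + k_n.
    """
    return tuple(sum(k[i:]) for i in range(len(k)))

def sigma_exponents(l):
    """Restore k from l: k_i = l_i - l_(i+1) for i < n, and k_n = l_n."""
    l = tuple(l)
    return tuple(a - b for a, b in zip(l, l[1:] + (0,)))

k = (1, 0, 2)                      # the term sigma_1 * sigma_3^2
l = highest_exponents(k)
print(l)                           # exponents of its highest term in x1, x2, x3
print(sigma_exponents(l) == k)     # the original exponents are recovered
```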

Let us now consider all the terms of the polynomial $\chi$: for each one of them let us find the highest term of its representation in the form of a polynomial in $x_1, x_2, \ldots, x_n$ and select that highest term which is highest in the alphabetical-ordering sense. As has been pointed out above, this term does not have any similar ones among the highest terms obtained from the other terms of the polynomial $\chi$, and since, by hypothesis, it is higher than each of these highest terms, it is all the more so higher than the other terms obtained when replacing in the terms of the polynomial $\chi$ the elements $\sigma_1, \sigma_2, \ldots, \sigma_n$ by their expressions (1). We have thus found a term which, when passing from $\chi(\sigma_1, \sigma_2, \ldots, \sigma_n)$ to $g(x_1, x_2, \ldots, x_n)$, appears (with nonzero coefficient) only once and for this reason cannot be cancelled out with anything in any way. Whence it follows that not all coefficients of the polynomial $g(x_1, x_2, \ldots, x_n)$ are equal to zero, that is, this polynomial is not a zero element of the ring $P[x_1, x_2, \ldots, x_n]$. The proof is complete.

Evidently, this theorem could also be stated in the following 
manner. 

A system of elementary symmetric polynomials Oj. a.,, . . .. a„ 
regarded as elements of the polynomial ring P \xx, x^. . . ., x^l is al- 
gebraically independent over the field P. 


53. Symmetric Polynomials Continued 

Remarks on the fundamental theorem. The proof of the fundamental theorem on symmetric polynomials given in the preceding section admits of a number of essential supplements to the statement of the theorem. We will make use of them in what follows. First of all, the coefficients of the polynomial $\varphi(\sigma_1, \sigma_2, \ldots, \sigma_n)$ which we found as an expression for the symmetric polynomial $f(x_1, x_2, \ldots, x_n)$ in terms of the elementary symmetric polynomials not only belong to the field $P$, but are even expressed in terms of the coefficients of the polynomial $f$ by means of addition and subtraction, i.e., they belong to the ring $L$ generated by the coefficients of the polynomial $f$ inside the field $P$.

Indeed, all coefficients of the polynomial $\varphi_1$ [see formula (3) of the preceding section] in the unknowns $x_1, x_2, \ldots, x_n$ are, as will readily be seen, integral multiples of the coefficient $a_0$ of the highest term of the polynomial $f$ and for this reason belong to the ring $L$. Suppose it has already been proved that $L$ contains all coefficients (in $x_1, x_2, \ldots, x_n$) of the polynomials $\varphi_1, \varphi_2, \ldots, \varphi_l$. Then the coefficients of the polynomial $f_l = f - \varphi_1 - \varphi_2 - \cdots - \varphi_l$ belong to $L$, and therefore $L$ also contains all coefficients of the polynomial $\varphi_{l+1}$ in $x_1, x_2, \ldots, x_n$.

On the other hand, the degree of the polynomial $\varphi(\sigma_1, \sigma_2, \ldots, \sigma_n)$ with respect to $\sigma_1, \sigma_2, \ldots, \sigma_n$ taken together is equal to the degree of the polynomial $f(x_1, x_2, \ldots, x_n)$ with respect to each of the unknowns $x_i$. Indeed, since (2) of Sec. 52 is the highest term of the polynomial $f$, it follows that $k_1$ will be the degree of $f$ in the unknown $x_1$, and therefore, by symmetry, in any other of the unknowns $x_i$ as well. However, the degree of $\varphi_1$ with respect to the $\sigma$ jointly is, by (5) of Sec. 52, equal to the number

\[ (k_1 - k_2) + (k_2 - k_3) + \cdots + (k_{n-1} - k_n) + k_n = k_1. \]

Furthermore, since the leading term of the polynomial $f_1$ is lower than the leading term of the polynomial $f$, it follows that the degree of $f_1$ with respect to each one of the $x_i$ will not exceed the degree of $f$ with respect to each one of these unknowns. However, for $f_1$ the polynomial $\varphi_2$ plays the same role as $\varphi_1$ for $f$, and so the degree of $\varphi_2$ with respect to the $\sigma$ jointly is equal to the degree of $f_1$ with respect to each one of the $x_i$; that is, it does not exceed $k_1$, and so on. Thus, likewise, the degree of $\varphi(\sigma_1, \sigma_2, \ldots, \sigma_n)$ does not exceed $k_1$. But since no $\varphi_i$ with $i > 1$ can contain all of $\sigma_1, \sigma_2, \ldots, \sigma_n$ to the same powers as $\varphi_1$, the degree of $\varphi(\sigma_1, \sigma_2, \ldots, \sigma_n)$ is exactly equal to $k_1$. Our assertion is thus proved.

Finally, let $a\sigma_1^{l_1}\sigma_2^{l_2}\cdots\sigma_n^{l_n}$ be one of the terms of the polynomial $\varphi(\sigma_1, \sigma_2, \ldots, \sigma_n)$. We give the name "weight" of this term to the number

\[ l_1 + 2l_2 + \cdots + nl_n, \]

that is, to the sum of the exponents multiplied by the indices of the corresponding $\sigma_i$. In other words, this is the degree of our term with respect to the unknowns $x_1, x_2, \ldots, x_n$ taken together, as follows from the theorem (proved in Sec. 51) on the degree of a product of polynomials. Then the following assertion holds true.

If, with respect to the totality of unknowns, a homogeneous symmetric polynomial $f(x_1, x_2, \ldots, x_n)$ has degree $s$, then all terms of its expression $\varphi(\sigma_1, \sigma_2, \ldots, \sigma_n)$ via the $\sigma$ will have the same weight equal to $s$.

Indeed, if (2) of Sec. 52 is the highest term of the homogeneous polynomial $f$, then

\[ s = k_1 + k_2 + \cdots + k_n. \]

However, the weight of the term $\varphi_1$ is, by (5) of Sec. 52, equal to

\[ (k_1 - k_2) + 2(k_2 - k_3) + \cdots + (n-1)(k_{n-1} - k_n) + nk_n = k_1 + k_2 + k_3 + \cdots + k_n. \]

That is, it is also equal to $s$. Furthermore, the polynomial $f_1 = f - \varphi_1$, being the difference of two homogeneous polynomials of degree $s$, will itself be homogeneous of degree $s$, and therefore the term $\varphi_2$ of the polynomial $\varphi$ will have weight $s$, etc.

Symmetric rational fractions. The fundamental theorem on symmetric polynomials can be extended to the case of rational fractions. Let us call the rational fraction $\frac{f}{g}$ in $n$ unknowns $x_1, x_2, \ldots, x_n$ symmetric if it remains equal to itself under any rearrangement of the unknowns. It is easy to demonstrate that this definition does not depend on whether we take the fraction $\frac{f}{g}$ or an equivalent fraction $\frac{f_0}{g_0}$. Indeed, if $\omega$ is some arrangement of our unknowns, and $\varphi$ is an arbitrary polynomial in these unknowns, then let us agree to use $\varphi^{\omega}$ to denote the polynomial into which $\varphi$ is carried by the arrangement $\omega$. By hypothesis, for any $\omega$,

\[ \frac{f}{g} = \frac{f^{\omega}}{g^{\omega}}. \]

That is, $fg^{\omega} = gf^{\omega}$. On the other hand, from

\[ \frac{f}{g} = \frac{f_0}{g_0} \]

it follows that $fg_0 = gf_0$, whence $f^{\omega}g_0^{\omega} = g^{\omega}f_0^{\omega}$. Multiplying both sides by $f$, we get

\[ ff^{\omega}g_0^{\omega} = fg^{\omega}f_0^{\omega} = gf^{\omega}f_0^{\omega}, \]

whence, by cancelling out $f^{\omega}$, it follows that $fg_0^{\omega} = gf_0^{\omega}$ or

\[ \frac{f_0^{\omega}}{g_0^{\omega}} = \frac{f}{g} = \frac{f_0}{g_0}. \]

The following theorem is valid.

Any symmetric rational fraction in the unknowns $x_1, x_2, \ldots, x_n$ with coefficients from the field $P$ can be represented as a rational fraction in the elementary symmetric polynomials $\sigma_1, \sigma_2, \ldots, \sigma_n$ with coefficients which again belong to $P$.

Indeed, suppose we have the symmetric rational fraction

\[ \frac{f(x_1, x_2, \ldots, x_n)}{g(x_1, x_2, \ldots, x_n)}. \]

Assuming it to be in lowest terms, we could prove that both $f$ and $g$ are symmetric polynomials. However, a simpler way is the following. If the polynomial $g$ is not symmetric, multiply the numerator and the denominator by the product of all $n! - 1$ polynomials obtained from $g$ under all possible nonidentical permutations of the unknowns. It is easy to check that the denominator will now be a symmetric polynomial. From this it follows, by the symmetry of the entire fraction, that the numerator will now also be symmetric, and so to prove the theorem all we have to do is express the numerator and the denominator in terms of the elementary symmetric polynomials.

Power sums. In applications we often encounter the symmetric polynomials

\[ s_k = x_1^k + x_2^k + \cdots + x_n^k, \qquad k = 1, 2, \ldots \]

which are sums of the $k$th powers of the unknowns $x_1, x_2, \ldots, x_n$. These polynomials, called power sums, must be expressible (by the fundamental theorem) in terms of elementary symmetric polynomials. However, for large $k$, it is extremely difficult to find these expressions, and so of interest is the relationship between the polynomials $s_1, s_2, \ldots$ and $\sigma_1, \sigma_2, \ldots, \sigma_n$, which we will now establish.

First of all, $s_1 = \sigma_1$. Next, if $k \le n$, then it is easy to verify the truth of the following equalities*:

\[
\begin{aligned}
s_{k-1}\sigma_1 &= s_k + S(x_1^{k-1}x_2),\\
s_{k-2}\sigma_2 &= S(x_1^{k-1}x_2) + S(x_1^{k-2}x_2x_3),\\
&\;\;\vdots\\
s_{k-i}\sigma_i &= S(x_1^{k-i+1}x_2\cdots x_i) + S(x_1^{k-i}x_2\cdots x_ix_{i+1}), \qquad 2 \le i \le k-2,\\
&\;\;\vdots\\
s_1\sigma_{k-1} &= S(x_1^2x_2\cdots x_{k-1}) + k\sigma_k.
\end{aligned}
\tag{1}
\]

Taking the alternating sum of these equalities (that is, the sum with alternating signs), and then transposing all terms to one side, we get the following formula:

\[ s_k - s_{k-1}\sigma_1 + s_{k-2}\sigma_2 - \cdots + (-1)^{k-1}s_1\sigma_{k-1} + (-1)^k k\sigma_k = 0 \qquad (k \le n). \tag{2} \]

* See (11) of Sec. 52.

But if $k > n$, then the system (1) of equations takes the form

\[
\begin{aligned}
s_{k-1}\sigma_1 &= s_k + S(x_1^{k-1}x_2),\\
s_{k-2}\sigma_2 &= S(x_1^{k-1}x_2) + S(x_1^{k-2}x_2x_3),\\
&\;\;\vdots\\
s_{k-i}\sigma_i &= S(x_1^{k-i+1}x_2\cdots x_i) + S(x_1^{k-i}x_2\cdots x_ix_{i+1}), \qquad 2 \le i \le n-1,\\
&\;\;\vdots\\
s_{k-n}\sigma_n &= S(x_1^{k-n+1}x_2\cdots x_n),
\end{aligned}
\]

whence follows the formula

\[ s_k - s_{k-1}\sigma_1 + s_{k-2}\sigma_2 - \cdots + (-1)^n s_{k-n}\sigma_n = 0 \qquad (k > n). \tag{3} \]

Formulas (2) and (3) are called Newton's formulas. They connect power sums with elementary symmetric polynomials and permit one to find, successively, the expressions for $s_1, s_2, s_3, \ldots$ in terms of $\sigma_1, \sigma_2, \ldots, \sigma_n$. Thus, we know that $s_1 = \sigma_1$, which also follows from formula (2). Furthermore, if $k = 2 \le n$, then, by (2), $s_2 - s_1\sigma_1 + 2\sigma_2 = 0$, whence

\[ s_2 = \sigma_1^2 - 2\sigma_2. \]

For $k = 3 \le n$ we have $s_3 - s_2\sigma_1 + s_1\sigma_2 - 3\sigma_3 = 0$, whence, using the expressions already found for $s_1$ and $s_2$, we get

\[ s_3 = \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3, \]

which is already familiar to us [see (12) of Sec. 52]. Now if $k = 3$ but $n = 2$, then, by (3), $s_3 - s_2\sigma_1 + s_1\sigma_2 = 0$, whence $s_3 = \sigma_1^3 - 3\sigma_1\sigma_2$. Using the Newton formulas, we can obtain a general formula expressing $s_k$ in terms of $\sigma_1, \sigma_2, \ldots, \sigma_n$. True, this formula is very unwieldy and so we will not give it.
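The successive use of Newton's formulas is easy to mechanize. The following Python sketch is only an illustration: the function names `elementary_symmetric` and `power_sums_from_sigma` are invented here and do not come from the text. It computes $s_1, s_2, \ldots$ from given values of $\sigma_1, \ldots, \sigma_n$, using formula (2) while $k \le n$ and formula (3) once $k > n$.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(xs):
    """Return [sigma_1, ..., sigma_n] for the numbers xs."""
    n = len(xs)
    return [sum(prod(c) for c in combinations(xs, k)) for k in range(1, n + 1)]


def power_sums_from_sigma(sigma, k_max):
    """Return [s_1, ..., s_k_max] computed from sigma_1..sigma_n by Newton's formulas."""
    n = len(sigma)
    s = []  # s[k - 1] holds s_k
    for k in range(1, k_max + 1):
        total = 0
        # terms (-1)^(i-1) * s_{k-i} * sigma_i for 1 <= i <= min(k - 1, n)
        for i in range(1, min(k - 1, n) + 1):
            total += (-1) ** (i - 1) * s[k - i - 1] * sigma[i - 1]
        if k <= n:
            # formula (2) ends with the extra term (-1)^k * k * sigma_k;
            # moved to the other side it contributes (-1)^(k-1) * k * sigma_k
            total += (-1) ** (k - 1) * k * sigma[k - 1]
        s.append(total)
    return s


xs = [2, -1, 3]
sigma = elementary_symmetric(xs)
newton = power_sums_from_sigma(sigma, 5)
direct = [sum(x ** k for x in xs) for k in range(1, 6)]
```

For the sample values above, `newton` and `direct` should coincide; the first three entries use formula (2) and the last two use formula (3).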

If the base field $P$ has characteristic 0 and for this reason division by any natural number $n$ is meaningful*, then formula (2) permits successively expressing the elementary symmetric polynomials $\sigma_1, \sigma_2, \ldots, \sigma_n$ in terms of the first $n$ power sums $s_1, s_2, \ldots, s_n$. Thus, $\sigma_1 = s_1$ and therefore

\[ \sigma_2 = \tfrac{1}{2}(s_1\sigma_1 - s_2) = \tfrac{1}{2}(s_1^2 - s_2), \]

\[ \sigma_3 = \tfrac{1}{3}(s_3 - s_2\sigma_1 + s_1\sigma_2) = \tfrac{1}{6}(s_1^3 - 3s_1s_2 + 2s_3) \]

and so forth. From the foregoing and from the fundamental theorem follows the result that

Any symmetric polynomial in $n$ unknowns $x_1, x_2, \ldots, x_n$ over a field $P$ of characteristic zero can be represented as a polynomial in the power sums $s_1, s_2, \ldots, s_n$ with coefficients belonging to the field $P$.

* In a field of characteristic $p$, the expression $\frac{a}{p}$ is meaningless for $a \neq 0$ since in this field $px = 0$ for any $x$.
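The passage from power sums back to elementary symmetric polynomials can be sketched in the same spirit. The code below is an illustrative sketch only: the name `sigma_from_power_sums` is invented here, and exact rational arithmetic from Python's `fractions` module stands in for a field of characteristic 0. It solves formula (2) for $\sigma_k$ one index at a time, which is exactly where division by $k$ is needed.

```python
from fractions import Fraction


def sigma_from_power_sums(s):
    """Given [s_1, ..., s_n], return [sigma_1, ..., sigma_n] using formula (2)."""
    n = len(s)
    sigma = []
    for k in range(1, n + 1):
        # acc = s_k - s_{k-1} sigma_1 + ... + (-1)^(k-1) s_1 sigma_{k-1}
        acc = Fraction(s[k - 1])
        for i in range(1, k):
            acc += (-1) ** i * s[k - i - 1] * sigma[i - 1]
        # formula (2): acc + (-1)^k * k * sigma_k = 0
        sigma.append((-1) ** (k - 1) * acc / k)
    return sigma


# power sums of the numbers 2, -1, 3
s1, s2, s3 = 4, 14, 34
sigmas = sigma_from_power_sums([s1, s2, s3])
closed_form_sigma3 = Fraction(s1 ** 3 - 3 * s1 * s2 + 2 * s3, 6)
```

Here `sigmas[2]` and `closed_form_sigma3` should agree, which checks the expression for $\sigma_3$ given above on one numerical instance.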

Polynomials symmetric in two systems of unknowns. In the next section, and also in Sec. 58, use will be made of a generalization of the concept of a symmetric polynomial. Suppose we have two systems of unknowns $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_r$, and suppose their union

\[ x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_r \tag{4} \]

is algebraically independent over the field $P$. The polynomial $f(x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_r)$ over the field $P$ is called symmetric in two systems of unknowns if it remains unchanged under any arrangements of the unknowns $x_1, x_2, \ldots, x_n$ among themselves and of the unknowns $y_1, y_2, \ldots, y_r$ among themselves. If we denote the elementary symmetric polynomials in $x_1, x_2, \ldots, x_n$ by $\sigma_1, \sigma_2, \ldots, \sigma_n$ and the elementary symmetric polynomials in $y_1, y_2, \ldots, y_r$ by $\tau_1, \tau_2, \ldots, \tau_r$, then the fundamental theorem is generalized as follows.

Any polynomial $f(x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_r)$ over the field $P$, which polynomial is symmetric with respect to the systems of unknowns $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_r$, can be represented as a polynomial (with coefficients from $P$) in the elementary symmetric polynomials with respect to these two systems of unknowns:

\[ f(x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_r) = \varphi(\sigma_1, \sigma_2, \ldots, \sigma_n, \tau_1, \tau_2, \ldots, \tau_r). \]

Indeed, the polynomial $f$ may be regarded as a polynomial $\bar{f}(y_1, y_2, \ldots, y_r)$ with coefficients which are polynomials in $x_1, x_2, \ldots, x_n$. Since $f$ remains unchanged under rearrangements of the unknowns $x_1, x_2, \ldots, x_n$, it follows that the coefficients of the polynomial $\bar{f}$ will be symmetric polynomials in $x_1, x_2, \ldots, x_n$ and therefore, by the fundamental theorem, can be represented as polynomials (with coefficients from $P$) in $\sigma_1, \sigma_2, \ldots, \sigma_n$. On the other hand, the polynomial $\bar{f}(y_1, y_2, \ldots, y_r)$ regarded over the field $P(x_1, x_2, \ldots, x_n)$ will be symmetric with respect to $y_1, y_2, \ldots, y_r$ and therefore can be represented as the polynomial $\bar{\varphi}(\tau_1, \tau_2, \ldots, \tau_r)$. The coefficients of the polynomial $\bar{\varphi}$ will, as was demonstrated at the beginning of this section, be expressed in terms of the coefficients of $\bar{f}$ by means of addition and subtraction, and so they too will be polynomials in $\sigma_1, \sigma_2, \ldots, \sigma_n$. This obviously leads us to the desired expression for $f$ in terms of $\sigma_1, \sigma_2, \ldots, \sigma_n, \tau_1, \tau_2, \ldots, \tau_r$.

Example. The polynomial

\[
\begin{aligned}
f(x_1, x_2, x_3, y_1, y_2) = {} & x_1x_2x_3 - x_1x_2y_1 - x_1x_2y_2 - x_1x_3y_1 - x_1x_3y_2 \\
& - x_2x_3y_1 - x_2x_3y_2 + x_1y_1y_2 + x_2y_1y_2 + x_3y_1y_2
\end{aligned}
\]

is symmetric both with respect to the unknowns $x_1, x_2, x_3$ and to the unknowns $y_1, y_2$, but is not symmetric with respect to the five unknowns taken together, as is evident from, say, a transposition of the unknowns $x_1$ and $y_1$. Let us find the expression for $f$ in terms of $\sigma_1, \sigma_2, \sigma_3, \tau_1, \tau_2$:

\[
\begin{aligned}
f &= x_1x_2x_3 - (x_1x_2 + x_1x_3 + x_2x_3)y_1 - (x_1x_2 + x_1x_3 + x_2x_3)y_2 + (x_1 + x_2 + x_3)y_1y_2 \\
&= \sigma_3 - \sigma_2y_1 - \sigma_2y_2 + \sigma_1y_1y_2 = \sigma_3 - \sigma_2\tau_1 + \sigma_1\tau_2.
\end{aligned}
\]

The theorem just proved can naturally be extended to the case of three or more systems of unknowns.
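The example polynomial in five unknowns can be checked numerically. The sketch below is merely a spot check at integer points, not a proof, and the helper names `f` and `via_sigma_tau` are introduced here for illustration. It compares the polynomial with the expression $\sigma_3 - \sigma_2\tau_1 + \sigma_1\tau_2$ and also confirms that swapping $x_1$ with $y_1$ changes its value.

```python
from itertools import permutations


def f(x1, x2, x3, y1, y2):
    """The example polynomial, written out term by term."""
    return (x1*x2*x3 - x1*x2*y1 - x1*x2*y2 - x1*x3*y1 - x1*x3*y2
            - x2*x3*y1 - x2*x3*y2 + x1*y1*y2 + x2*y1*y2 + x3*y1*y2)


def via_sigma_tau(x1, x2, x3, y1, y2):
    """The same polynomial expressed as sigma_3 - sigma_2*tau_1 + sigma_1*tau_2."""
    sigma1 = x1 + x2 + x3
    sigma2 = x1*x2 + x1*x3 + x2*x3
    sigma3 = x1*x2*x3
    tau1 = y1 + y2
    tau2 = y1*y2
    return sigma3 - sigma2*tau1 + sigma1*tau2


xs, ys = (2, 3, 5), (7, 11)
value = f(*xs, *ys)

# values under every rearrangement of the x's and of the y's separately
symmetric_values = {f(*px, *py) for px in permutations(xs) for py in permutations(ys)}

# value after the transposition of x1 and y1
swapped = f(ys[0], xs[1], xs[2], xs[0], ys[1])
```

At this sample point `symmetric_values` should contain the single number `value`, while `swapped` should differ from it.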

For polynomials symmetric with respect to two systems of unknowns, the theorem of unique representation in terms of elementary symmetric polynomials also holds true. In other words, the following theorem is valid.

The combined system

\[ \sigma_1, \sigma_2, \ldots, \sigma_n, \tau_1, \tau_2, \ldots, \tau_r \]

of elementary symmetric polynomials in the given systems of unknowns $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_r$ is algebraically independent over the field $P$.

Indeed, suppose over the field $P$ there is a polynomial

\[ \varphi(\sigma_1, \sigma_2, \ldots, \sigma_n, \tau_1, \tau_2, \ldots, \tau_r) \]

equal to zero although not all its coefficients are zeros. This polynomial may be regarded as a polynomial $\psi(\tau_1, \tau_2, \ldots, \tau_r)$ with coefficients which are polynomials in $\sigma_1, \sigma_2, \ldots, \sigma_n$. We can, consequently, take it that $\psi$ is a polynomial in $\tau_1, \tau_2, \ldots, \tau_r$ over the field of rational fractions

\[ Q = P(x_1, x_2, \ldots, x_n). \]

The system $y_1, y_2, \ldots, y_r$ remains algebraically independent over the field $Q$: if, in this system, there were algebraic dependence with coefficients from $Q$ then, by eliminating the denominators, we would obtain an algebraic dependence in system (4), which contradicts the assumption. Proceeding from the uniqueness theorem of the preceding section, we now find that the system $\tau_1, \tau_2, \ldots, \tau_r$ must also be algebraically independent over the field $Q$, and therefore all coefficients of the polynomial $\psi$ are equal to zero. However, these coefficients are polynomials in $\sigma_1, \sigma_2, \ldots, \sigma_n$ and therefore, again on the basis of the uniqueness theorem for the case of one system of unknowns (this time, the system $x_1, x_2, \ldots, x_n$), all coefficients of these latter polynomials are themselves zero. This proves that, in contradiction with the hypothesis, all coefficients of the polynomial $\varphi$ must be zero.





54. Resultant. Elimination of Unknown. Discriminant

If we have a polynomial $f(x_1, x_2, \ldots, x_n)$ from the ring $P[x_1, x_2, \ldots, x_n]$, then its solution is a set of values of the unknowns

\[ x_1 = \alpha_1,\ x_2 = \alpha_2,\ \ldots,\ x_n = \alpha_n \]

taken in the field $P$ or in some extension $\overline{P}$ of this field, a set that makes the polynomial $f$ vanish:

\[ f(\alpha_1, \alpha_2, \ldots, \alpha_n) = 0. \]

Every polynomial $f$ of degree greater than zero has solutions: if the unknown $x_1$ occurs in the notation of this polynomial, then for $\alpha_2, \ldots, \alpha_n$ we can actually take any elements of the field $P$, provided only that the degree of the polynomial $f(x_1, \alpha_2, \ldots, \alpha_n)$ is strictly positive, and then, using the theorem on the existence of a root (Sec. 40), take an extension $\overline{P}$ of the field $P$ in which the polynomial $f(x_1, \alpha_2, \ldots, \alpha_n)$ in the single unknown $x_1$ has the root $\alpha_1$. At the same time, we see that the property of a polynomial of degree $n$ in one unknown to have, in any field, not more than $n$ roots ceases to hold true for polynomials in several unknowns.

If we have several polynomials in $n$ unknowns, we can pose the question of finding solutions that are common to all these polynomials; that is, solutions of the system of equations which is obtained by equating the given polynomials to zero. A particular case of this problem, namely the case of systems of linear equations, was considered in detail in Chapter 2. However, concerning the opposite case of one equation in one unknown but of arbitrary degree, we know nothing about the roots except that they exist in some extension of the base field. Finding and studying solutions of an arbitrary nonlinear system of equations in several unknowns is, quite understandably, a still more involved problem that goes beyond the scope of our present course and constitutes a special branch of mathematics known as algebraic geometry. Here, we confine ourselves to a system of two equations of arbitrary degree in two unknowns; we will show that this case can be reduced to that of one equation in one unknown.

Let us first take up the question of the existence of common roots of two polynomials in one unknown. Suppose we have the polynomials

\[
\begin{aligned}
f(x) &= a_0x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n,\\
g(x) &= b_0x^s + b_1x^{s-1} + \cdots + b_{s-1}x + b_s
\end{aligned}
\tag{1}
\]

over the field $P$, $a_0 \neq 0$, $b_0 \neq 0$.

From the results of the preceding chapter, it readily follows that polynomials $f(x)$ and $g(x)$ have a common root in some extension of the field $P$ if and only if they are not relatively prime. Thus, the question of the existence of common roots of the given polynomials can be resolved by applying the Euclidean algorithm.
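The test by the Euclidean algorithm can be sketched as follows. This is an illustrative sketch under stated assumptions: polynomials are stored as coefficient lists in descending powers, as in (1), the base field is taken to be the rationals via Python's `fractions` module, and the helper names `strip`, `poly_rem` and `poly_gcd` are invented here. Two polynomials have a common root in some extension exactly when the greatest common divisor found this way has positive degree.

```python
from fractions import Fraction


def strip(p):
    """Drop leading zero coefficients (descending-power coefficient list)."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]


def poly_rem(a, b):
    """Remainder of a on division by b; b must have a nonzero leading coefficient."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)  # the leading coefficient has just been cancelled
    return strip(a)


def poly_gcd(a, b):
    """Monic greatest common divisor of two polynomials that are not both zero."""
    a = strip([Fraction(c) for c in a])
    b = strip([Fraction(c) for c in b])
    while b:
        a, b = b, poly_rem(a, b)
    return [c / a[0] for c in a]


# x^2 - 4x - 5 and x^2 - 7x + 10 share the root 5
g1 = poly_gcd([1, -4, -5], [1, -7, 10])
# x^2 - 6x + 2 and x^2 + x + 5 are relatively prime
g2 = poly_gcd([1, -6, 2], [1, 1, 5])
```

In the first case the result is the list for $x - 5$, in the second the constant polynomial 1.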

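We will now give another method. Let $\overline{P}$ be some extension of the field $P$ in which $f(x)$ has $n$ roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ and $g(x)$ has $s$ roots $\beta_1, \beta_2, \ldots, \beta_s$; for $\overline{P}$ we can take the splitting field for the product $f(x)g(x)$. The element

\[ R(f, g) = a_0^s b_0^n \prod_{i=1}^{n}\prod_{j=1}^{s} (\alpha_i - \beta_j) \tag{2} \]

of the field $\overline{P}$ is called the resultant of the polynomials $f(x)$ and $g(x)$. It is obvious that $f(x)$ and $g(x)$ have a common root in $\overline{P}$ if and only if $R(f, g) = 0$. Since

\[ g(x) = b_0 \prod_{j=1}^{s} (x - \beta_j) \]

and therefore

\[ g(\alpha_i) = b_0 \prod_{j=1}^{s} (\alpha_i - \beta_j), \]

it follows that the resultant $R(f, g)$ can also be written as

\[ R(f, g) = a_0^s \prod_{i=1}^{n} g(\alpha_i). \tag{3} \]

The polynomials $f(x)$ and $g(x)$ are utilized in nonsymmetric fashion in determining the resultant. Indeed,

\[ R(g, f) = b_0^n a_0^s \prod_{j=1}^{s}\prod_{i=1}^{n} (\beta_j - \alpha_i) = (-1)^{ns} R(f, g). \tag{4} \]

In accordance with (3), $R(g, f)$ may be written as

\[ R(g, f) = b_0^n \prod_{j=1}^{s} f(\beta_j). \tag{5} \]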

Expression (2) for a resultant requires a knowledge of the roots of the polynomials $f(x)$ and $g(x)$ and therefore is, in a practical sense, useless for solving the problem of the existence of a common root of these two polynomials. However, it turns out that the resultant $R(f, g)$ may be represented in the form of a polynomial in the coefficients $a_0, a_1, \ldots, a_n, b_0, b_1, \ldots, b_s$ of the polynomials $f(x)$ and $g(x)$.

The possibility of such a representation follows readily from the results of the preceding section. Indeed, formula (2) shows that the resultant $R(f, g)$ is a symmetric polynomial in two sets of unknowns: the set $\alpha_1, \alpha_2, \ldots, \alpha_n$ and the set $\beta_1, \beta_2, \ldots, \beta_s$. Therefore, as proved at the end of the preceding section, it can be represented in the form of a polynomial in the elementary symmetric polynomials with respect to these two systems of unknowns, that is, by the Vieta formulas, as a polynomial in the quotients $\frac{a_i}{a_0}$, $i = 1, 2, \ldots, n$, and $\frac{b_j}{b_0}$, $j = 1, 2, \ldots, s$; the factor $a_0^s b_0^n$ included in (2) eliminates $a_0$ and $b_0$ from the denominators of the resulting expression. Incidentally, it would be an arduous task to find the expression of the resultant in terms of the coefficients by means of methods described in the preceding sections, and so we will proceed differently.

The expression for the resultant of the polynomials (1) that we will find will suit any pair of such polynomials. To be more precise, we will take it that the set of roots

\[ \alpha_1, \alpha_2, \ldots, \alpha_n, \beta_1, \beta_2, \ldots, \beta_s \tag{6} \]

of the polynomials (1) is a set of $n + s$ independent unknowns, that is, a set of $n + s$ elements which are algebraically independent over the field $P$ in the sense of Sec. 51.

We will get an expression for the resultant, which expression, regarded as a polynomial in the unknowns (6) (after replacement of the coefficients by the roots via the Vieta formulas), will be equal to the right member of (2); this member is also regarded as a polynomial in the unknowns (6).

Regarding the equality precisely in the sense of an identity in the set of unknowns (6), we will prove that the resultant $R(f, g)$ of the polynomials (1) is equal to the following determinant of order $n + s$:



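\[
D = \left|\begin{array}{ccccccc}
a_0 & a_1 & \cdots & a_n & & & \\
 & a_0 & a_1 & \cdots & a_n & & \\
 & & \ddots & \ddots & & \ddots & \\
 & & & a_0 & a_1 & \cdots & a_n \\
b_0 & b_1 & \cdots & b_s & & & \\
 & b_0 & b_1 & \cdots & b_s & & \\
 & & \ddots & \ddots & & \ddots & \\
 & & & b_0 & b_1 & \cdots & b_s
\end{array}\right|
\tag{7}
\]

Here the first $s$ rows contain the coefficients $a_0, a_1, \ldots, a_n$ and the last $n$ rows contain the coefficients $b_0, b_1, \ldots, b_s$ (all vacancies are occupied by zeros). The structure of this determinant is clear enough; it need only be noted that the coefficient $a_0$ appears $s$ times on the principal diagonal and the coefficient $b_s$ occurs $n$ times.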

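To prove our assertion, we compute in two ways the product $DM$, where $M$ is the auxiliary determinant of order $n + s$:

\[
M = \left|\begin{array}{cccccccc}
\beta_1^{n+s-1} & \beta_2^{n+s-1} & \cdots & \beta_s^{n+s-1} & \alpha_1^{n+s-1} & \alpha_2^{n+s-1} & \cdots & \alpha_n^{n+s-1} \\
\beta_1^{n+s-2} & \beta_2^{n+s-2} & \cdots & \beta_s^{n+s-2} & \alpha_1^{n+s-2} & \alpha_2^{n+s-2} & \cdots & \alpha_n^{n+s-2} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
\beta_1^2 & \beta_2^2 & \cdots & \beta_s^2 & \alpha_1^2 & \alpha_2^2 & \cdots & \alpha_n^2 \\
\beta_1 & \beta_2 & \cdots & \beta_s & \alpha_1 & \alpha_2 & \cdots & \alpha_n \\
1 & 1 & \cdots & 1 & 1 & 1 & \cdots & 1
\end{array}\right|
\]

$M$ is the Vandermonde determinant and so it is equal (see Sec. 6) to the product of the differences of the elements of its second-to-last row, any succeeding element being subtracted from any preceding element. Thus,

\[ M = \prod_{1 \le i < j \le s} (\beta_i - \beta_j) \cdot \prod_{i=1}^{s}\prod_{j=1}^{n} (\beta_i - \alpha_j) \cdot \prod_{1 \le i < j \le n} (\alpha_i - \alpha_j) \]

and therefore, by (4),

\[ a_0^s b_0^n DM = D \cdot R(g, f) \cdot \prod_{1 \le i < j \le s} (\beta_i - \beta_j) \cdot \prod_{1 \le i < j \le n} (\alpha_i - \alpha_j). \tag{8} \]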

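On the other hand, let us compute the product $DM$ on the basis of the theorem on the determinant of a product of matrices. Multiplying out the appropriate matrices and taking into account that all $\alpha$ are roots of $f(x)$ and all $\beta$ are roots of $g(x)$, we get

\[
DM = \left|\begin{array}{cccccccc}
\beta_1^{s-1}f(\beta_1) & \beta_2^{s-1}f(\beta_2) & \cdots & \beta_s^{s-1}f(\beta_s) & 0 & 0 & \cdots & 0 \\
\beta_1^{s-2}f(\beta_1) & \beta_2^{s-2}f(\beta_2) & \cdots & \beta_s^{s-2}f(\beta_s) & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
\beta_1f(\beta_1) & \beta_2f(\beta_2) & \cdots & \beta_sf(\beta_s) & 0 & 0 & \cdots & 0 \\
f(\beta_1) & f(\beta_2) & \cdots & f(\beta_s) & 0 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 & \alpha_1^{n-1}g(\alpha_1) & \alpha_2^{n-1}g(\alpha_2) & \cdots & \alpha_n^{n-1}g(\alpha_n) \\
0 & 0 & \cdots & 0 & \alpha_1^{n-2}g(\alpha_1) & \alpha_2^{n-2}g(\alpha_2) & \cdots & \alpha_n^{n-2}g(\alpha_n) \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & \alpha_1g(\alpha_1) & \alpha_2g(\alpha_2) & \cdots & \alpha_ng(\alpha_n) \\
0 & 0 & \cdots & 0 & g(\alpha_1) & g(\alpha_2) & \cdots & g(\alpha_n)
\end{array}\right|
\]

Applying the Laplace theorem, then taking common factors out of the columns of the determinants and computing the remaining determinants as Vandermonde determinants, we obtain

\[ a_0^s b_0^n DM = a_0^s b_0^n \prod_{j=1}^{s} f(\beta_j) \cdot \prod_{1 \le i < j \le s} (\beta_i - \beta_j) \cdot \prod_{i=1}^{n} g(\alpha_i) \cdot \prod_{1 \le i < j \le n} (\alpha_i - \alpha_j) \]

or, using (3) and (5),

\[ a_0^s b_0^n DM = R(f, g)\, R(g, f) \cdot \prod_{1 \le i < j \le s} (\beta_i - \beta_j) \cdot \prod_{1 \le i < j \le n} (\alpha_i - \alpha_j). \tag{9} \]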



We find that the right sides of (8) and (9), considered as polynomials in the unknowns (6), are equal. Both sides of the resulting equation can be reduced by common factors not identically zero. The common factor $R(g, f)$ is not equal to zero: since $a_0 \neq 0$ and $b_0 \neq 0$ by hypothesis, it suffices to select for the unknowns (6) nonequal values (in the base field or in some extension of it) in order to obtain from (4) a nonzero value for the polynomial $R(g, f)$. In the same way, we prove that the other two common factors are also different from zero. Cancelling out common factors, we arrive at the equality

\[ R(f, g) = D \tag{10} \]

which is what we set out to prove.
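The equality of the resultant and the determinant of order $n + s$ can be tried out on concrete numbers. The following sketch is illustrative only: the function names are invented here, exact rational arithmetic stands in for the field $P$, and a numerical check on one pair of polynomials is of course no substitute for the proof. It builds the determinant from the coefficients, evaluates it by Gaussian elimination, and compares it with the product over differences of roots from the definition of the resultant.

```python
from fractions import Fraction


def sylvester_matrix(a, b):
    """Matrix of the determinant (7); a = [a0..an], b = [b0..bs] in descending powers."""
    n, s = len(a) - 1, len(b) - 1
    rows = [[0] * k + list(a) + [0] * (s - 1 - k) for k in range(s)]   # s rows of a's
    rows += [[0] * k + list(b) + [0] * (n - 1 - k) for k in range(n)]  # n rows of b's
    return rows


def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    size = len(m)
    result = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result  # a row swap changes the sign
        result *= m[c][c]
        for r in range(c + 1, size):
            factor = m[r][c] / m[c][c]
            for j in range(c, size):
                m[r][j] -= factor * m[c][j]
    return result


def poly_from_roots(lead, roots):
    """Coefficients (descending powers) of lead * (x - r1) * (x - r2) * ..."""
    coeffs = [Fraction(lead)]
    for r in roots:
        coeffs = [c - r * p for c, p in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def resultant_from_roots(a0, alphas, b0, betas):
    """The defining product a0^s * b0^n * prod(alpha_i - beta_j)."""
    value = Fraction(a0) ** len(betas) * Fraction(b0) ** len(alphas)
    for al in alphas:
        for be in betas:
            value *= al - be
    return value


alphas, betas = [1, 3], [2, -1, 5]
f = poly_from_roots(2, alphas)   # 2x^2 - 8x + 6
g = poly_from_roots(3, betas)    # 3x^3 - 18x^2 + 9x + 30
d = det(sylvester_matrix(f, g))
r = resultant_from_roots(2, alphas, 3, betas)
```

For this pair `d` and `r` should be the same number, and replacing one of the `betas` by 1 or 3 should make both vanish.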

Let us now give up the requirement that the leading coefficients of the polynomials (1) be different from zero*. Concerning the true degrees of these polynomials, it is thus possible to assert only that they do not exceed their "formal" degrees $n$ and, respectively, $s$. For the resultant, the expression (2) is now meaningless, since it may be that the polynomials in question have fewer roots than $n$ or $s$. On the other hand, determinant (7) can be written now as well, and since it is already proved that for $a_0 \neq 0$, $b_0 \neq 0$ this determinant is equal to the resultant, it follows that in our general case too we can call it the resultant of the polynomials $f(x)$ and $g(x)$ and denote it by $R(f, g)$.

However, we can no longer hope that the fact that the resultant is zero is equivalent to our polynomials having a root in common. Indeed, if $a_0 = 0$ and $b_0 = 0$, then $R(f, g) = 0$, irrespective of whether the polynomials $f$ and $g$ have common roots or not. It turns out, however, that this case is the only case when one cannot conclude that if the resultant is zero, the given polynomials have common roots**. Namely, the following theorem is valid.

If we have polynomials (1) with arbitrary leading coefficients, then the resultant (7) of these polynomials is zero if and only if the polynomials have a common root or if their leading coefficients are both zero.

Proof. The case of $a_0 \neq 0$, $b_0 \neq 0$ has already been considered, and the case of $a_0 = b_0 = 0$ is covered in the statement of the theorem. It remains to consider the case when one of the leading coefficients of the polynomials (1), say $a_0$, is nonzero and $b_0$ is equal to zero.

If $b_i = 0$ for all $i$, $i = 0, 1, \ldots, s$, then $R(f, g) = 0$ since the determinant (7) contains zero rows. In this case, however, the polynomial $g(x)$ is identically zero and therefore has common roots with $f(x)$. However, if

\[ b_0 = b_1 = \cdots = b_{k-1} = 0, \quad\text{but}\quad b_k \neq 0, \quad k \le s, \]

and if

\[ \bar{g}(x) = b_kx^{s-k} + b_{k+1}x^{s-k-1} + \cdots + b_{s-1}x + b_s, \]

then, replacing the elements $b_0, b_1, \ldots, b_{k-1}$ in (7) with zeros and applying the Laplace theorem, we obviously get

\[ R(f, g) = a_0^k R(f, \bar{g}). \tag{11} \]

But since the leading coefficients of both polynomials $f$ and $\bar{g}$ are different from zero, it follows, from what was proved above, that the equality $R(f, \bar{g}) = 0$ is necessary and sufficient for the polynomials $f$ and $\bar{g}$ to have a root in common. On the other hand, by (11), the equalities $R(f, g) = 0$ and $R(f, \bar{g}) = 0$ are equivalent, and since the polynomials $g$ and $\bar{g}$ of course have the same roots, we find that in the case at hand as well the fact that the resultant $R(f, g)$ is zero is equivalent to the polynomials $f(x)$ and $g(x)$ having a common root. This proves the theorem.

* This temporary rejection of the condition on the leading coefficient of the polynomial, which was valid up to now, is due to subsequent applications: we want to consider systems of polynomials in two unknowns and we want to regard one of the unknowns as a coefficient. Thus, the leading coefficient can vanish for particular values of this unknown.

** The determinant (7) is of course also equal to zero when $a_n = b_s = 0$. However, in this case the polynomials (1) have a common root 0.

Let us find the resultant of the two quadratic polynomials

\[ f(x) = a_0x^2 + a_1x + a_2, \qquad g(x) = b_0x^2 + b_1x + b_2. \]

By (7),

\[
R(f, g) = \left|\begin{array}{cccc}
a_0 & a_1 & a_2 & 0 \\
0 & a_0 & a_1 & a_2 \\
b_0 & b_1 & b_2 & 0 \\
0 & b_0 & b_1 & b_2
\end{array}\right|
\]

or, computing the determinant via expansion by the first and third rows,

\[ R(f, g) = (a_0b_2 - a_2b_0)^2 - (a_0b_1 - a_1b_0)(a_1b_2 - a_2b_1). \tag{12} \]

Thus, if we have the polynomials

\[ f(x) = x^2 - 6x + 2, \qquad g(x) = x^2 + x + 5 \]

then, by (12), $R(f, g) = 233$ and so these polynomials do not have any roots in common. But if we have the polynomials

\[ f(x) = x^2 - 4x - 5, \qquad g(x) = x^2 - 7x + 10 \]

then $R(f, g) = 0$, which means that they have a common root, the number 5.
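Formula (12) is short enough to check directly. The sketch below is illustrative, with the names `resultant_quadratics`, `det` and `sylvester_4x4` introduced here and not taken from the text. It evaluates (12) for the two numerical pairs above and compares it with the determinant of order 4, expanded by cofactors.

```python
def resultant_quadratics(a, b):
    """Formula (12) for a = (a0, a1, a2), b = (b0, b1, b2)."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    return (a0 * b2 - a2 * b0) ** 2 - (a0 * b1 - a1 * b0) * (a1 * b2 - a2 * b1)


def det(m):
    """Determinant by cofactor expansion along the first row (fine for small sizes)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def sylvester_4x4(a, b):
    """The determinant (7) written out for two quadratic polynomials."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    return [[a0, a1, a2, 0],
            [0, a0, a1, a2],
            [b0, b1, b2, 0],
            [0, b0, b1, b2]]


r_first = resultant_quadratics((1, -6, 2), (1, 1, 5))
r_second = resultant_quadratics((1, -4, -5), (1, -7, 10))
```

With these inputs `r_first` is 233 and `r_second` is 0, in agreement with the two cases just discussed.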

Eliminating an unknown from a system of two equations in two unknowns. Suppose we have two polynomials $f$ and $g$ in two unknowns $x$ and $y$ with coefficients from some field $P$. We write the polynomials in descending powers of $x$:

\[
\begin{aligned}
f(x, y) &= a_0(y)x^k + a_1(y)x^{k-1} + \cdots + a_{k-1}(y)x + a_k(y),\\
g(x, y) &= b_0(y)x^l + b_1(y)x^{l-1} + \cdots + b_{l-1}(y)x + b_l(y).
\end{aligned}
\tag{13}
\]

The coefficients will be polynomials from the ring $P[y]$. We find the resultant of $f$ and $g$, which are regarded as polynomials in $x$, and denote it by $R_x(f, g)$. By (7) it will be a polynomial in the single unknown $y$ with coefficients from the field $P$:

\[ R_x(f, g) = F(y). \tag{14} \]

Let the system of polynomials (13) have, in some extension of the field $P$, the common solution $x = \alpha$, $y = \beta$. Substituting the value $\beta$ in place of $y$ in (13), we get two polynomials $f(x, \beta)$ and $g(x, \beta)$ in the one unknown $x$. These polynomials have the common root $\alpha$ and therefore their resultant, which by (14) is equal to $F(\beta)$, must be equal to zero, that is, $\beta$ must be a root of the resultant $R_x(f, g)$. Conversely, if the resultant $R_x(f, g)$ of the polynomials (13) has the root $\beta$, then the resultant of the polynomials $f(x, \beta)$ and $g(x, \beta)$ is zero. That is to say, either these polynomials have a common root or both their leading coefficients are zero,

\[ a_0(\beta) = b_0(\beta) = 0. \]

The finding of common solutions of the system (13) of polynomials is reduced to the finding of roots of the single polynomial (14) in the single unknown $y$. We say that the unknown $x$ has been eliminated from the system (13) of polynomials.

The next theorem relates to the question of the degree of the polynomial which we obtain after eliminating one unknown from the system of two polynomials in two unknowns.

If, taking the unknowns together, the polynomials $f(x, y)$ and $g(x, y)$ are respectively of degrees $n$ and $s$, then the degree of the polynomial $R_x(f, g)$ in the unknown $y$ does not exceed the product $ns$, if, of course, this polynomial is not identically zero.

First of all, if we regard two polynomials in one unknown with leading coefficients equal to unity, then, by (2), their resultant $R(f, g)$ is a homogeneous polynomial in $\alpha_1, \alpha_2, \ldots, \alpha_n, \beta_1, \beta_2, \ldots, \beta_s$ of degree $ns$. From this it follows that if the term

\[ a_1^{k_1}a_2^{k_2}\cdots a_n^{k_n}b_1^{l_1}b_2^{l_2}\cdots b_s^{l_s} \]

enters into the expression of the resultant via the coefficients $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_s$ and if the weight of this term is the number

\[ k_1 + 2k_2 + \cdots + nk_n + l_1 + 2l_2 + \cdots + sl_s, \]

then all terms of $R(f, g)$ expressed via the coefficients have the same weight equal to $ns$. This assertion also holds true in the general case for terms of the resultant (7) if the number

\[ 0 \cdot k_0 + 1 \cdot k_1 + \cdots + n \cdot k_n + 0 \cdot l_0 + 1 \cdot l_1 + \cdots + s \cdot l_s \tag{15} \]

is given as the weight of the term $a_0^{k_0}a_1^{k_1}\cdots a_n^{k_n}b_0^{l_0}b_1^{l_1}\cdots b_s^{l_s}$.

Indeed, replacing the factors $a_0$ and $b_0$ by unity in the terms of determinant (7), we arrive at the case that has already been considered; however, the exponents on these factors enter into (15) with coefficients 0.

Now write the polynomials $f$ and $g$ as follows:

\[
\begin{aligned}
f(x, y) &= a_0(y)x^n + a_1(y)x^{n-1} + \cdots + a_n(y),\\
g(x, y) &= b_0(y)x^s + b_1(y)x^{s-1} + \cdots + b_s(y).
\end{aligned}
\]

Since $n$ is the degree of $f(x, y)$ in the unknowns jointly, the degree of the coefficient $a_r(y)$, $r = 0, 1, 2, \ldots, n$, cannot exceed its index $r$; this holds true for $b_r(y)$ as well. Whence it follows that the degree of each term of the resultant $R_x(f, g)$ does not exceed the weight of this term, which is to say it is not greater than the number $ns$. This completes the proof.


Example 1. Find the common solutions of the following system of polynomials:

$$f(x, y) = x^2 y + 3xy + 2y + 3,$$
$$g(x, y) = 2xy - 2x + 2y + 3.$$

Eliminate $x$ from this system; to do this, rewrite it as

$$\left.\begin{aligned} f(x, y) &= y \cdot x^2 + (3y)\,x + (2y + 3), \\ g(x, y) &= (2y - 2)\,x + (2y + 3); \end{aligned}\right\} \tag{16}$$

then

$$R_x(f, g) = \begin{vmatrix} y & 3y & 2y + 3 \\ 2y - 2 & 2y + 3 & 0 \\ 0 & 2y - 2 & 2y + 3 \end{vmatrix} = 2y^2 + 11y + 12.$$

The numbers $\beta_1 = -4$, $\beta_2 = -\frac{3}{2}$ will be the roots of the resultant. The leading coefficients of the polynomials (16) do not vanish for these values of the unknown $y$, and so each of them, together with some value for $x$, constitutes a solution of the given system of polynomials. The polynomials

$$f(x, -4) = -4x^2 - 12x - 5,$$
$$g(x, -4) = -10x - 5$$

have the common root $\alpha_1 = -\frac{1}{2}$. The polynomials

$$f\left(x, -\tfrac{3}{2}\right) = -\tfrac{3}{2}x^2 - \tfrac{9}{2}x,$$
$$g\left(x, -\tfrac{3}{2}\right) = -5x$$

have the common root $\alpha_2 = 0$. Thus, the given system of polynomials has two solutions:

$$\alpha_1 = -\tfrac{1}{2},\ \beta_1 = -4 \quad\text{and}\quad \alpha_2 = 0,\ \beta_2 = -\tfrac{3}{2}.$$

Example 2. Eliminate one unknown from the system of polynomials

$$f(x, y) = 2x^3 y - xy^2 + x + 5,$$
$$g(x, y) = x^2 y^2 + 2xy^2 - 5y + 1.$$

Since both polynomials are of degree 2 in the unknown $y$, whereas one of them is of degree 3 in $x$, it is advisable to eliminate $y$. Rewrite the system as

$$\left.\begin{aligned} f(x, y) &= (-x)\,y^2 + (2x^3)\,y + (x + 5), \\ g(x, y) &= (x^2 + 2x)\,y^2 - 5y + 1 \end{aligned}\right\} \tag{17}$$

and find its resultant, applying formula (12):

$$R_y(f, g) = [(-x) \cdot 1 - (x + 5)(x^2 + 2x)]^2 - [(-x)(-5) - 2x^3 (x^2 + 2x)]\,[2x^3 \cdot 1 - (x + 5)(-5)]$$
$$= 4x^8 + 8x^7 + 11x^6 + 84x^5 + 161x^4 + 154x^3 + 96x^2 - 125x.$$

One of the roots of the resultant is 0. However, for this value of the unknown $x$, both leading coefficients of the polynomials (17) vanish; and, as is readily seen, the polynomials $f(0, y)$ and $g(0, y)$ do not have any common roots. We do not have any method for finding the other roots of the resultant. We can only assert that if we found them [say in the splitting field for $R_y(f, g)$], then not one of them would make both leading coefficients of the polynomials (17) vanish, and therefore each of these roots, together with some value for $y$ (one or even several), would constitute a solution of the given system of polynomials.
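The two elimination examples above can be checked mechanically. The following is a minimal sketch in Python using exact rational arithmetic; the helper names (`det3`, `resultant_x`, `resultant_y`, `expanded`) are illustrative and not part of the text. It evaluates the determinant of Example 1 and the two-quadratics expression used in Example 2 at sample points and compares them with the expanded polynomials, then substitutes the two solutions of Example 1 back into the system.

```python
from fractions import Fraction as Fr

def f1(x, y):
    return x * x * y + 3 * x * y + 2 * y + 3

def g1(x, y):
    return 2 * x * y - 2 * x + 2 * y + 3

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def resultant_x(y):
    """Determinant from Example 1, evaluated at a given value of y."""
    return det3([[y, 3 * y, 2 * y + 3],
                 [2 * y - 2, 2 * y + 3, 0],
                 [0, 2 * y - 2, 2 * y + 3]])

def resultant_y(x):
    """Expression used in Example 2 for two quadratics in y, evaluated at x."""
    a0, a1, a2 = -x, 2 * x**3, x + 5
    b0, b1, b2 = x * x + 2 * x, -5, 1
    return (a0 * b2 - a2 * b0) ** 2 - (a0 * b1 - a1 * b0) * (a1 * b2 - a2 * b1)

def expanded(x):
    return (4 * x**8 + 8 * x**7 + 11 * x**6 + 84 * x**5
            + 161 * x**4 + 154 * x**3 + 96 * x**2 - 125 * x)

# A polynomial of degree at most 3 (resp. 8) is fixed by its values at
# 13 (resp. 11) distinct points, so agreement on the samples is conclusive.
ex1_ok = all(resultant_x(y) == 2 * y * y + 11 * y + 12
             for y in (Fr(k, 2) for k in range(-6, 7)))
ex2_ok = all(resultant_y(x) == expanded(x) for x in range(-5, 6))

# Substitute the two solutions of Example 1 back into the system.
solutions = [(Fr(-1, 2), Fr(-4)), (Fr(0), Fr(-3, 2))]
residuals = [(f1(x, y), g1(x, y)) for x, y in solutions]
```

Sampling is used here instead of symbolic expansion only to keep the sketch short; a computer algebra system would give the same comparison directly.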

There are also methods for successively eliminating the unknowns from systems with an arbitrary number of polynomials and unknowns. They are, however, too involved to be included in this course.

Discriminant. By analogy with the question that led us to the concept of a resultant, we can ask about the conditions under which a polynomial $f(x)$ of degree $n$ from the ring $P[x]$ has multiple roots. Let

$$f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_{n-1} x + a_n, \quad a_0 \neq 0,$$

and suppose that in some extension of the field $P$ this polynomial has the roots $\alpha_1, \alpha_2, \dots, \alpha_n$. It is obvious that there will be equal roots among them if and only if the following product is zero:

$$\Delta = (\alpha_2 - \alpha_1)(\alpha_3 - \alpha_1) \dots (\alpha_n - \alpha_1)(\alpha_3 - \alpha_2)(\alpha_4 - \alpha_2) \dots (\alpha_n - \alpha_2) \dots (\alpha_n - \alpha_{n-1}) = \prod_{n \ge i > j \ge 1} (\alpha_i - \alpha_j)$$

or, equivalently, if the product

$$D = a_0^{2n-2} \prod_{n \ge i > j \ge 1} (\alpha_i - \alpha_j)^2,$$

called the discriminant of the polynomial $f(x)$, is zero.

Unlike the product $\Delta$, which can change sign upon a rearrangement of the roots, the discriminant $D$ is symmetric with respect to $\alpha_1, \alpha_2, \dots, \alpha_n$ and can therefore be expressed in terms of the coefficients of the polynomial $f(x)$. To find this expression, under the assumption that the field $P$ has characteristic zero, we can take advantage of the connection between the discriminant of the polynomial $f(x)$ and the resultant of this polynomial and its derivative. It is natural to expect such a connection: we know from Sec. 49 that a polynomial has multiple roots if and only if it has roots in common with the derivative $f'(x)$, and therefore $D = 0$ if and only if $R(f, f') = 0$.

By formula (3) of this section,

$$R(f, f') = a_0^{n-1} \prod_{i=1}^{n} f'(\alpha_i).$$

Differentiating

$$f(x) = a_0 \prod_{k=1}^{n} (x - \alpha_k),$$

we get

$$f'(x) = a_0 \sum_{k=1}^{n} \prod_{j \neq k} (x - \alpha_j).$$

After substitution of $\alpha_i$ for $x$, all terms except the $i$th vanish, and so

$$f'(\alpha_i) = a_0 \prod_{j \neq i} (\alpha_i - \alpha_j),$$

whence

$$R(f, f') = a_0^{2n-1} \prod_{i=1}^{n} \prod_{j \neq i} (\alpha_i - \alpha_j).$$

For any $i$ and $j$, $i > j$, two factors enter into this product: $\alpha_i - \alpha_j$ and $\alpha_j - \alpha_i$. Their product is equal to $(-1)(\alpha_i - \alpha_j)^2$, and since there are $\frac{n(n-1)}{2}$ pairs of indices $i, j$ satisfying the inequalities $n \ge i > j \ge 1$, it follows that

$$R(f, f') = (-1)^{\frac{n(n-1)}{2}} a_0^{2n-1} \prod_{n \ge i > j \ge 1} (\alpha_i - \alpha_j)^2 = (-1)^{\frac{n(n-1)}{2}} a_0 D.$$

Example. Find the discriminant of the quadratic trinomial

$$f(x) = ax^2 + bx + c.$$

Since $f'(x) = 2ax + b$, it follows that

$$R(f, f') = \begin{vmatrix} a & b & c \\ 2a & b & 0 \\ 0 & 2a & b \end{vmatrix} = a(-b^2 + 4ac).$$

In our case, $\frac{n(n-1)}{2} = 1$, and so

$$D = -a^{-1} R(f, f') = b^2 - 4ac.$$

This coincides with what school algebra calls the discriminant of a quadratic equation.

Another way of finding the discriminant is the following. Form a Vandermonde determinant from the powers of the roots $\alpha_1, \alpha_2, \dots, \alpha_n$. As indicated in Sec. 6,

$$\begin{vmatrix} 1 & 1 & \dots & 1 \\ \alpha_1 & \alpha_2 & \dots & \alpha_n \\ \alpha_1^2 & \alpha_2^2 & \dots & \alpha_n^2 \\ \dots & \dots & \dots & \dots \\ \alpha_1^{n-1} & \alpha_2^{n-1} & \dots & \alpha_n^{n-1} \end{vmatrix} = \prod_{n \ge i > j \ge 1} (\alpha_i - \alpha_j) = \Delta,$$

and so the discriminant is equal to the square of this determinant multiplied by $a_0^{2n-2}$. Multiplying this determinant by its transpose by the rule for matrix multiplication and recalling the power sums defined in the preceding section, we get

$$D = a_0^{2n-2} \begin{vmatrix} n & s_1 & s_2 & \dots & s_{n-1} \\ s_1 & s_2 & s_3 & \dots & s_n \\ s_2 & s_3 & s_4 & \dots & s_{n+1} \\ \dots & \dots & \dots & \dots & \dots \\ s_{n-1} & s_n & s_{n+1} & \dots & s_{2n-2} \end{vmatrix} \tag{18}$$

where $s_k$ is the sum of the $k$th powers of the roots $\alpha_1, \alpha_2, \dots, \alpha_n$.



Example. Find the discriminant of the cubic polynomial $f(x) = x^3 + ax^2 + bx + c$. By (18),

$$D = \begin{vmatrix} 3 & s_1 & s_2 \\ s_1 & s_2 & s_3 \\ s_2 & s_3 & s_4 \end{vmatrix}.$$

As we know from the preceding section,

$$s_1 = \sigma_1 = -a,$$
$$s_2 = \sigma_1^2 - 2\sigma_2 = a^2 - 2b,$$
$$s_3 = \sigma_1^3 - 3\sigma_1 \sigma_2 + 3\sigma_3 = -a^3 + 3ab - 3c.$$

Using Newton's formula, we will also find that (because $\sigma_4 = 0$)

$$s_4 = \sigma_1^4 - 4\sigma_1^2 \sigma_2 + 4\sigma_1 \sigma_3 + 2\sigma_2^2 = a^4 - 4a^2 b + 4ac + 2b^2.$$

Whence

$$D = 3 s_2 s_4 + 2 s_1 s_2 s_3 - s_2^3 - s_1^2 s_4 - 3 s_3^2$$
$$= a^2 b^2 - 4b^3 - 4a^3 c + 18abc - 27c^2.$$

In particular, for $a = 0$, i.e., for an incomplete cubic polynomial, we obtain

$$D = -4b^3 - 27c^2,$$

in complete accordance with what was said in Sec. 38.
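As a sanity check on the cubic formula, the discriminant can be computed in three ways for a cubic built from chosen roots: from the definition as a product of squared root differences, from the determinant of power sums (18), and from the coefficient formula just derived. The sketch below is illustrative only; the function names are not from the text, and integer roots are used so that all arithmetic is exact.

```python
from itertools import combinations

def cubic_from_roots(r1, r2, r3):
    """Coefficients (a, b, c) of x^3 + a x^2 + b x + c with the given roots."""
    a = -(r1 + r2 + r3)
    b = r1 * r2 + r1 * r3 + r2 * r3
    c = -(r1 * r2 * r3)
    return a, b, c

def disc_from_roots(roots):
    """Definition: product of (alpha_i - alpha_j)^2 over i > j (here a_0 = 1)."""
    d = 1
    for u, v in combinations(roots, 2):
        d *= (u - v) ** 2
    return d

def disc_from_power_sums(roots):
    """Determinant (18) for n = 3, with s_k the k-th power sums of the roots."""
    s = [sum(r ** k for r in roots) for k in range(5)]  # s[0] = 3
    (p, q, r), (t, u, v), (w, x, y) = [[s[i + j] for j in range(3)]
                                        for i in range(3)]
    return p * (u * y - v * x) - q * (t * y - v * w) + r * (t * x - u * w)

def disc_from_coeffs(a, b, c):
    """Coefficient formula obtained in the example above."""
    return a * a * b * b - 4 * b**3 - 4 * a**3 * c + 18 * a * b * c - 27 * c * c

roots = (1, 2, 4)
values = (disc_from_roots(roots),
          disc_from_power_sums(roots),
          disc_from_coeffs(*cubic_from_roots(*roots)))
```

For a repeated root, for instance `(2, 2, 5)`, all three functions return zero, as the theory requires.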
55. Alternative Proof of the Fundamental Theorem of the Algebra of Complex Numbers

The proof of the fundamental theorem given in Sec. 23 was completely nonalgebraic. Here we give another proof, which takes advantage of an extensive algebraic apparatus: essential use is made of the fundamental theorem on symmetric polynomials (Sec. 52) and also of the theorem on the existence of a splitting field for any polynomial (Sec. 49). At the same time, the nonalgebraic portion of the proof is minimal and is reduced to a single simple assertion.

First note that in Sec. 23 we proved a lemma on the modulus of the highest-degree term of a polynomial. Taking the coefficients of a polynomial $f(x)$ to be real and putting $k = 1$, we obtain the following corollary of this lemma.

For real values of x sufficiently large in absolute value the sign of 
a polynomial f (x) with real coefficients coincides with the sign of the 
highest-degree term. 

From this follows the result that 

A polynomial of odd degree with real coefficients has at least one 
real root. 

Indeed, let

$$f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_n$$

and all coefficients be real. Because of the oddness of $n$, the highest-degree term $a_0 x^n$ has different signs for positive and negative values of $x$, and therefore, as was proved above, the polynomial $f(x)$ will also have different signs for positive and negative values of $x$ sufficiently large in absolute value. There consequently exist real values of $x$, say $a$ and $b$, such that

$$f(a) < 0, \quad f(b) > 0.$$

However, from the course of analysis we know that a polynomial (a rational integral function, that is) $f(x)$ is a continuous function and for this reason, because of one of the basic properties of continuous functions, $f(x)$ takes on any given value intermediate between $f(a)$ and $f(b)$ for certain real values of $x$ between $a$ and $b$. For example, there is an $\alpha$ between $a$ and $b$ such that $f(\alpha) = 0$.
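The argument above is constructive in spirit: once two points with values of opposite sign are known, bisection locates a root numerically. The sketch below is only an illustration in floating-point arithmetic. It assumes the standard Cauchy bound $1 + \max_i |a_i / a_0|$ for the absolute values of the roots, which is not taken from the text, to choose the two starting points; the function names are hypothetical.

```python
def horner(coeffs, x):
    """Evaluate a polynomial with coefficients given leading-first."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value

def real_root_odd(coeffs, iterations=200):
    """Approximate one real root of an odd-degree real polynomial by bisection."""
    degree = len(coeffs) - 1
    if degree % 2 == 0 or coeffs[0] == 0:
        raise ValueError("expected an odd-degree polynomial, leading coefficient first")
    # Assumed (standard) Cauchy bound: every root has absolute value below this.
    bound = 1 + max(abs(c) for c in coeffs[1:]) / abs(coeffs[0])
    lo, hi = -bound, bound
    if horner(coeffs, lo) > 0:      # orient so that f(lo) < 0 < f(hi)
        lo, hi = hi, lo
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if horner(coeffs, mid) <= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

root = real_root_odd([1, 0, -2, -5])    # x^3 - 2x - 5
```

A fixed iteration count is used instead of a tolerance test so that the loop always terminates, whatever the magnitude of the root.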

Using this result, we will prove the following assertion. 

Every polynomial of arbitrary degree with real coefficients has at 
least one complex root. 

Indeed, suppose we have a polynomial $f(x)$ with real coefficients having degree $n = 2^k q$, where $q$ is an odd number. Since the case $k = 0$ has already been considered (see above), we shall assume $k > 0$, that is, we consider $n$ an even number, and we will argue by induction with respect to $k$, on the assumption that our assertion has been proved for all polynomials with real coefficients whose degrees are divisible by $2^{k-1}$ but not divisible by $2^k$.*

* Consequently, this degree can even be greater than $n$.

Let $P$ be a splitting field for the polynomial $f(x)$ over the field of complex numbers (see Sec. 49), and let $\alpha_1, \alpha_2, \dots, \alpha_n$ be the roots of $f(x)$ in $P$. Choose an arbitrary real number $c$ and take the elements of the field $P$ having the form

$$\beta_{ij} = \alpha_i \alpha_j + c(\alpha_i + \alpha_j), \quad i < j. \tag{1}$$

The number of elements $\beta_{ij}$ is obviously equal to

$$\frac{n(n-1)}{2} = \frac{2^k q (2^k q - 1)}{2} = 2^{k-1} q (2^k q - 1) = 2^{k-1} q', \tag{2}$$

where $q'$ is an odd number.

Let us now construct from the ring $P[x]$ a polynomial $g(x)$ having for its roots all the elements $\beta_{ij}$ and only these elements:

$$g(x) = \prod_{i, j,\ i < j} (x - \beta_{ij}).$$

The coefficients of this polynomial are elementary symmetric polynomials in $\beta_{ij}$. Consequently, by (1), they will be polynomials in $\alpha_1, \alpha_2, \dots, \alpha_n$ with real coefficients (since the number $c$ is real); they will even be symmetric polynomials. Indeed, a transposition of any two $\alpha$, say $\alpha_k$ and $\alpha_l$, implies merely a rearrangement in the set of all $\beta_{ij}$: every $\beta_{kj}$, where $j$ is different from $k$ and from $l$, is converted into $\beta_{lj}$, and conversely, whereas $\beta_{kl}$ and all $\beta_{ij}$ for $i$ and $j$ different from $k$ and $l$ remain fixed. But the coefficients of the polynomial $g(x)$ remain unchanged under a rearrangement of its roots.

From this it follows, by the fundamental theorem on symmetric polynomials, that the coefficients of the polynomial $g(x)$ will be polynomials (with real coefficients) in the coefficients of the given polynomial $f(x)$ and for this reason will themselves be real numbers. The degree of this polynomial, which is equal to the number of the roots $\beta_{ij}$, is divisible, according to (2), by $2^{k-1}$ but is not divisible by $2^k$. And so, by the induction hypothesis, at least one of the roots $\beta_{ij}$ of the polynomial $g(x)$ must be a complex number.

Thus, for any choice of the real number $c$ there is a pair of indices $i, j$, $1 \le i < j \le n$, such that the element $\alpha_i \alpha_j + c(\alpha_i + \alpha_j)$ is a complex number (recall that the field $P$ contains the field of complex numbers as a subfield). Quite naturally, to any other choice of the number $c$ there will, generally speaking, correspond (in the indicated sense) another pair of indices. However, there exist an infinitude of distinct real numbers $c$, whereas we have at our disposal only a finite number of distinct pairs $i, j$. Whence it follows that we can choose two distinct real numbers $c_1$ and $c_2$, $c_1 \neq c_2$, such that they are associated with one and the same pair of indices $i, j$, for which

$$\left.\begin{aligned} \alpha_i \alpha_j + c_1(\alpha_i + \alpha_j) &= a, \\ \alpha_i \alpha_j + c_2(\alpha_i + \alpha_j) &= b \end{aligned}\right\} \tag{3}$$

are complex numbers.

From the equalities (3) it follows that

$$(c_1 - c_2)(\alpha_i + \alpha_j) = a - b,$$

whence

$$\alpha_i + \alpha_j = \frac{a - b}{c_1 - c_2}.$$

That is to say, this sum is a complex number. From this and, say, from the first of the equalities (3) it follows that the product $\alpha_i \alpha_j$ will also be a complex number. Thus, the elements $\alpha_i$ and $\alpha_j$ are roots of the quadratic equation

$$x^2 - (\alpha_i + \alpha_j)\,x + \alpha_i \alpha_j = 0$$

with complex coefficients and therefore, as follows from the formula (derived in Sec. 38) for solving quadratic equations with complex coefficients, they must themselves be complex numbers. Thus, among the roots of the polynomial $f(x)$ we have even found two complex roots and the proof of our assertion is complete.

For a complete proof of the fundamental theorem, we have yet to consider the case of a polynomial with arbitrary complex coefficients. Let

$$f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_n$$

be such a polynomial. Take the polynomial

$$\bar{f}(x) = \bar{a}_0 x^n + \bar{a}_1 x^{n-1} + \dots + \bar{a}_n$$

obtained from $f(x)$ by replacing all coefficients with conjugate complex numbers and then consider the product

$$F(x) = f(x)\,\bar{f}(x) = b_0 x^{2n} + b_1 x^{2n-1} + \dots + b_k x^{2n-k} + \dots + b_{2n},$$

where, evidently,

$$b_k = \sum_{i+j=k} a_i \bar{a}_j, \quad k = 0, 1, 2, \dots, 2n.$$

Using the familiar properties of conjugate complex numbers (see Sec. 18), we find that

$$\bar{b}_k = \sum_{i+j=k} \bar{a}_i a_j = b_k.$$

That is, all coefficients of the polynomial $F(x)$ prove to be real.

It then follows, as proved above, that the polynomial $F(x)$ has at least one complex root $\beta$,

$$F(\beta) = f(\beta)\,\bar{f}(\beta) = 0.$$

That is, either $f(\beta) = 0$ or $\bar{f}(\beta) = 0$. In the former case, the theorem is proved. But if the latter case occurs, that is,

$$\bar{a}_0 \beta^n + \bar{a}_1 \beta^{n-1} + \dots + \bar{a}_n = 0,$$

then, replacing all complex numbers here by their conjugates (which, as we know, does not affect the equality), we get

$$f(\bar{\beta}) = a_0 \bar{\beta}^n + a_1 \bar{\beta}^{n-1} + \dots + a_n = 0.$$

Thus, $f(x)$ has the complex number $\bar{\beta}$ for its root. This completes the proof of the fundamental theorem.
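The key computational fact in this last step, that the product $f(x)\,\bar{f}(x)$ has real coefficients, is easy to observe on an example. The following sketch multiplies a sample polynomial with Gaussian-integer coefficients by its coefficientwise conjugate; the sample polynomial and the helper name `poly_mul` are hypothetical, chosen only for illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, leading-first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b     # b_k collects all a_i * b_j with i + j = k
    return out

# Hypothetical sample: f(x) = (1 + 2i) x^2 - 3i x + (4 + i)
f = [complex(1, 2), complex(0, -3), complex(4, 1)]
f_bar = [c.conjugate() for c in f]

F = poly_mul(f, f_bar)
imaginary_parts = [c.imag for c in F]   # all zero: F has real coefficients
```

Small integer real and imaginary parts were chosen so that the floating-point arithmetic is exact and the imaginary parts vanish exactly rather than approximately.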



CHAPTER 12 


POLYNOMIALS 
WITH RATIONAL 
COEFFICIENTS 


56. Reducibility of Polynomials over the Field of Rationals

The field of rational numbers, $R$, is the third number field of particular interest to us, along with the fields of real and complex numbers. It is the smallest of the number fields: as proved in Sec. 43, the field $R$ is contained in its entirety in any number field. We will now investigate the question of the reducibility of polynomials over the field of rationals; in the next section we deal with the rational (integral and fractional) roots of polynomials with rational coefficients. We stress once again that these are two different things: the polynomial

$$x^4 + 2x^2 + 1 = (x^2 + 1)^2$$

is reducible over the field of rational numbers, though it does not have a single rational root.

What can be said about the reducibility of polynomials over the field $R$? First of all, note that if we have a polynomial $f(x)$ whose coefficients are rational but are not all integral, then, reducing the coefficients to a common denominator and multiplying $f(x)$ by this denominator (equal, say, to $k$), we obtain a polynomial $kf(x)$, all the coefficients of which will now be integers. It is evident that the polynomials $f(x)$ and $kf(x)$ have the same roots; on the other hand, they will at the same time be reducible or irreducible over the field $R$.

However, we are not yet entitled to confine ourselves to a consideration of polynomials with integral coefficients. Indeed, let the integral polynomial $g(x)$ (i.e., a polynomial with integral coefficients) be reducible over the field of rationals, i.e., factorable into lower-degree factors with rational (in the general case, fractional) coefficients. Does factorability of $g(x)$ into factors with integral coefficients follow from this? In other words, might it not be true that a polynomial with integral coefficients that is reducible over the field of rational numbers is irreducible over the ring of integers?





The answers may be obtained via considerations similar to those carried out in Sec. 51. Let us call a polynomial $f(x)$ with integral coefficients primitive if its coefficients are jointly relatively prime, that is, if they do not have any common divisors different from 1 and $-1$. If we have an arbitrary polynomial $\varphi(x)$ with rational coefficients, it may be uniquely represented in the form of a product of a lowest-terms fraction by some primitive polynomial:

$$\varphi(x) = \frac{a}{b}\,f(x). \tag{1}$$

To do this, factor out the common denominator of all coefficients of the polynomial $\varphi(x)$ and then also the common factors of the numerators of these coefficients; note that the degree of $f(x)$ is the same as that of $\varphi(x)$. The uniqueness (to within sign) of the representation (1) is proved as follows. Let

$$\varphi(x) = \frac{a}{b}\,f(x) = \frac{c}{d}\,g(x),$$

where $g(x)$ is again a primitive polynomial. Then

$$ad\,f(x) = bc\,g(x).$$

Thus, $ad$ and $bc$ are obtained by taking all the common factors out of the coefficients of one and the same integral polynomial, and therefore they can differ in sign alone. Whence it follows that the primitive polynomials $f(x)$ and $g(x)$ can likewise differ only in sign.

The Gaussian lemma holds true for integral primitive polynomials.

The product of two integral primitive polynomials is a primitive polynomial.

Indeed, suppose we have the integral primitive polynomials

$$f(x) = a_0 x^k + a_1 x^{k-1} + \dots + a_i x^{k-i} + \dots + a_k,$$
$$g(x) = b_0 x^l + b_1 x^{l-1} + \dots + b_j x^{l-j} + \dots + b_l$$

and let

$$f(x)\,g(x) = c_0 x^{k+l} + c_1 x^{k+l-1} + \dots + c_{i+j} x^{k+l-(i+j)} + \dots + c_{k+l}.$$

If this product is not primitive, then there is a prime $p$ that serves as a common divisor of all coefficients $c_0, c_1, \dots, c_{k+l}$. Since all the coefficients of the primitive polynomial $f(x)$ cannot be divisible by $p$, let the coefficient $a_i$ be the first one not divisible by $p$. Similarly, denote by $b_j$ the first coefficient of the polynomial $g(x)$ not divisible by $p$. Multiplying $f(x)$ and $g(x)$ termwise and collecting terms in $x^{k+l-(i+j)}$, we obtain

$$c_{i+j} = a_i b_j + a_{i-1} b_{j+1} + a_{i-2} b_{j+2} + \dots + a_{i+1} b_{j-1} + a_{i+2} b_{j-2} + \dots$$

The left side is divisible by $p$. Also, all terms on the right are certainly divisible by $p$, except the first. Indeed, by the conditions imposed on the choice of $i$ and $j$, all the coefficients $a_{i-1}, a_{i-2}, \dots$ and also $b_{j-1}, b_{j-2}, \dots$ are divisible by $p$. It then follows that the product $a_i b_j$ is also divisible by $p$ and therefore, due to the primality of the number $p$, $p$ should divide at least one of the coefficients $a_i$, $b_j$, which, however, is not the case. The lemma is proved.
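The lemma is easy to test on particular polynomials. In the sketch below, `content` denotes the greatest common divisor of the coefficients (a name introduced here for convenience, not used in the text), so a polynomial is primitive exactly when its content is 1. The sample polynomials are hypothetical.

```python
from functools import reduce
from math import gcd

def content(p):
    """Greatest common divisor of the integer coefficients of p."""
    return reduce(gcd, (abs(c) for c in p))

def is_primitive(p):
    return content(p) == 1

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, leading-first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Two primitive polynomials whose individual coefficients share many factors.
f = [6, 10, 15]          # 6x^2 + 10x + 15
g = [4, 9]               # 4x + 9
product = poly_mul(f, g)
product_is_primitive = is_primitive(product)
```

Taking out the contents first shows the same thing for arbitrary integral polynomials: the content of a product equals the product of the contents, for example for `[2, 4]` and `[3, 9]`.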

Let us now answer the questions posed above. Let a polynomial $g(x)$ of degree $n$ with integral coefficients be reducible over the field of rational numbers:

$$g(x) = \varphi_1(x)\,\varphi_2(x),$$

where $\varphi_1(x)$ and $\varphi_2(x)$ are polynomials with rational coefficients and of degree less than $n$. Then, by (1),

$$\varphi_i(x) = \frac{a_i}{b_i}\,f_i(x), \quad i = 1, 2,$$

where $\frac{a_i}{b_i}$ is in lowest terms and $f_i(x)$ is a primitive polynomial. Then

$$g(x) = \frac{a_1 a_2}{b_1 b_2}\,[f_1(x)\,f_2(x)].$$

The left member is an integral polynomial and so the denominator $b_1 b_2$ in the right member must cancel out. But the polynomial in brackets will, by the Gaussian lemma, be primitive, and so any prime factor from $b_1 b_2$ can cancel out only with some prime factor from $a_1 a_2$, and since $a_i$ and $b_i$ are relatively prime, $i = 1, 2$, the number $a_2$ must be exactly divisible by $b_1$ and $a_1$ by $b_2$:

$$a_2 = b_1 a_2', \quad a_1 = b_2 a_1'.$$

Whence

$$g(x) = a_1' a_2'\,f_1(x)\,f_2(x).$$

Adjoining the coefficient $a_1' a_2'$ to any one of the factors $f_1(x)$, $f_2(x)$, we obtain a factorization of the polynomial $g(x)$ into factors of lower degree with integral coefficients. This is the proof of the following theorem.

A polynomial with integral coefficients that is irreducible over the ring of integers will also be irreducible over the field of rational numbers.

We can now restrict ourselves, in questions relating to the reducibility of polynomials over the field of rationals, to a consideration of factorizations of integral polynomials into factors whose coefficients are all likewise integral.

We know that any polynomial of degree greater than unity is reducible over the field of complex numbers, and any polynomial (with real coefficients) of degree greater than two is reducible over the field of real numbers. The situation regarding the field of rational numbers is quite different: for any $n$ there is a polynomial of degree $n$ with rational (even integral) coefficients that is irreducible over the field of rational numbers. The proof of this assertion is based on the following sufficient criterion of the irreducibility of a polynomial over the field $R$, called the Eisenstein criterion.

Suppose we have the polynomial

$$f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_{n-1} x + a_n$$

with integral coefficients. If there is at least one way in which we can choose a prime number $p$ that satisfies the following requirements:

(1) the leading coefficient $a_0$ is not divisible by $p$,

(2) all the other coefficients are divisible by $p$,

(3) the constant term is divisible by $p$ but not by $p^2$,

then the polynomial $f(x)$ is irreducible over the field of rational numbers.

Indeed, if the polynomial $f(x)$ is reducible over the field $R$, then it can be factored into two factors of lower degree with integral coefficients:

$$f(x) = (b_0 x^k + b_1 x^{k-1} + \dots + b_k)(c_0 x^l + c_1 x^{l-1} + \dots + c_l),$$

where $k < n$, $l < n$, $k + l = n$. From this, comparing coefficients in both members of the equation, we obtain

$$\left.\begin{aligned} a_n &= b_k c_l, \\ a_{n-1} &= b_k c_{l-1} + b_{k-1} c_l, \\ a_{n-2} &= b_k c_{l-2} + b_{k-1} c_{l-1} + b_{k-2} c_l, \\ &\dots \\ a_0 &= b_0 c_0. \end{aligned}\right\} \tag{2}$$

From the first of the equalities (2) it follows that, since $a_n$ is divisible by $p$ and $p$ is prime, one of the factors $b_k$, $c_l$ must be divisible by $p$. Both cannot be divisible by $p$ at the same time since $a_n$, by hypothesis, is not divisible by $p^2$. For instance, let $p$ divide $b_k$; therefore $c_l$ is prime to $p$. We now go over to the second of the equalities (2). Its left member and also the first term in the right member are divisible by $p$, and so $p$ divides the product $b_{k-1} c_l$ as well, but since $p$ does not divide $c_l$, $p$ does divide $b_{k-1}$. In the same fashion, we find from the third equality of (2) that $p$ divides $b_{k-2}$, and so on. Finally, from the $(k + 1)$th equality it will be found that $p$ divides $b_0$; but then from the last equality of (2) it follows that $p$ divides $a_0$, which contradicts our assumption.

It is extremely easy to write, for any $n$, integral polynomials of degree $n$ that satisfy the conditions of the Eisenstein criterion and, hence, are irreducible over the field of rational numbers. Such, for example, is the polynomial $x^n - 2$; the Eisenstein criterion is applicable for $p = 2$.

The Eisenstein criterion is only a sufficient condition for irreducibility over the field $R$, but by no means is it a necessary condition: if it is not possible, for a given polynomial $f(x)$, to find a prime number $p$ such that the conditions of the Eisenstein criterion are fulfilled, it may be reducible, like $x^2 - 5x + 6$, but it may also be irreducible, like $x^2 + 1$. There are a large number of other sufficient criteria besides the Eisenstein criterion (though less important) for irreducibility of polynomials over the field $R$. There is also a method, due to Kronecker, which permits one to decide whether any polynomial with integral coefficients is reducible or not over $R$. However, it is very unwieldy and hardly at all applicable in a practical sense.
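The three conditions of the Eisenstein criterion are simple divisibility tests and can be checked mechanically. The sketch below is illustrative; the function names are hypothetical. Since a suitable prime must divide the constant term, which is then necessarily nonzero, it is enough to try the primes not exceeding the absolute value of the constant term.

```python
def eisenstein_holds(coeffs, p):
    """Check conditions (1)-(3) for coeffs = [a0, a1, ..., an] and a prime p."""
    a0, others, an = coeffs[0], coeffs[1:], coeffs[-1]
    return (a0 % p != 0                          # (1) p does not divide a0
            and all(c % p == 0 for c in others)  # (2) p divides a1, ..., an
            and an % (p * p) != 0)               # (3) p^2 does not divide an

def primes_up_to(n):
    """All primes <= n by trial division (adequate for small n)."""
    return [q for q in range(2, n + 1)
            if all(q % d for d in range(2, int(q ** 0.5) + 1))]

def eisenstein_primes(coeffs):
    """Primes for which the criterion applies; an empty list decides nothing."""
    return [p for p in primes_up_to(abs(coeffs[-1]))
            if eisenstein_holds(coeffs, p)]

applies = eisenstein_primes([1, 0, 0, 0, 0, -2])    # x^5 - 2
reducible_case = eisenstein_primes([1, -5, 6])      # x^2 - 5x + 6
irreducible_case = eisenstein_primes([1, 0, 1])     # x^2 + 1
```

The last two calls both return an empty list, matching the remark above: failure of the criterion says nothing about reducibility either way.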


Example. Consider the polynomial

$$f_p(x) = \frac{x^p - 1}{x - 1} = x^{p-1} + x^{p-2} + \dots + x + 1,$$

where $p$ is a prime number. The roots of this polynomial are $p$th roots of unity different from unity itself; since these roots, together with 1, divide the unit circle of the complex plane into $p$ equal parts, the polynomial $f_p(x)$ is called a cyclotomic polynomial.

The Eisenstein criterion cannot be directly applied to this polynomial. But by changing the variable, setting $x = y + 1$, we get

$$g(y) = f_p(y + 1) = \frac{(y + 1)^p - 1}{(y + 1) - 1} = \frac{1}{y}\left[ y^p + p\,y^{p-1} + \frac{p(p-1)}{2}\,y^{p-2} + \dots + p\,y \right]$$
$$= y^{p-1} + p\,y^{p-2} + \frac{p(p-1)}{2}\,y^{p-3} + \dots + p.$$

The coefficients of the polynomial $g(y)$ are binomial coefficients and so all, except the leading coefficient, are divisible by $p$; the constant term is not divisible by $p^2$. Thus, by the Eisenstein criterion, the polynomial $g(y)$ is irreducible over the field $R$, whence follows the irreducibility over $R$ of the cyclotomic polynomial $f_p(x)$. Indeed, if

$$f_p(x) = \varphi(x)\,\psi(x),$$

then

$$g(y) = \varphi(y + 1)\,\psi(y + 1).$$
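The change of variable in this example can be carried out numerically. The sketch below computes the coefficients of $f_p(y + 1)$ by repeated Horner division by $x - 1$, compares them with the binomial coefficients, and checks the Eisenstein conditions for the prime $p$. The function names are illustrative, not from the text.

```python
from math import comb

def shift_by_one(coeffs):
    """Coefficients (leading-first) of f(y + 1), by repeated Horner division by x - 1."""
    c = list(coeffs)
    n = len(c)
    for i in range(n - 1):
        # Each pass leaves the next Taylor coefficient at position n - 1 - i.
        for j in range(1, n - i):
            c[j] += c[j - 1]
    return c

def shifted_cyclotomic(p):
    """g(y) = f_p(y + 1) for f_p(x) = x^(p-1) + ... + x + 1."""
    return shift_by_one([1] * p)

def satisfies_eisenstein(coeffs, p):
    return (coeffs[0] % p != 0
            and all(c % p == 0 for c in coeffs[1:])
            and coeffs[-1] % (p * p) != 0)

g5 = shifted_cyclotomic(5)
binomials = [comb(5, k) for k in range(5)]   # C(5,0), ..., C(5,4)
g5_ok = satisfies_eisenstein(g5, 5)
```

Applied directly to `[1] * p`, the check fails for every prime, since none of the coefficients of $f_p(x)$ is divisible by a prime; the shift is what makes the criterion applicable.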


57. Rational Roots of Integral Polynomials 

It was pointed out above that the question of the factorization of a given polynomial over the field of rational numbers into irreducible factors has no really satisfactory practical solution. However, the particular case referring to the isolation of linear factors of a polynomial with rational coefficients, that is, to the finding of its rational roots, is very simple and may be solved without excessive computations. Quite naturally, the problem of finding rational roots of a polynomial with rational coefficients does not in the least exhaust the general problem of the real roots of these polynomials; that is to say, the methods and results given in Chapter 9 are valid in toto when applied to polynomials with rational coefficients.

As we take up the question of finding the rational roots of polynomials with rational coefficients, it is well to note that, as indicated in the preceding section, we can confine ourselves to polynomials with integral coefficients. We shall consider separately the case of integral and that of fractional roots.

If an integer $\alpha$ is a root of a polynomial $f(x)$ with integral coefficients, then $\alpha$ is a divisor of the constant term of the polynomial.

Indeed, let

$$f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_n.$$

Divide $f(x)$ by $x - \alpha$:

$$f(x) = (x - \alpha)(b_0 x^{n-1} + b_1 x^{n-2} + \dots + b_{n-1}).$$

Performing the division by the Horner method (see Sec. 22), we find that all coefficients of the quotient, including $b_{n-1}$, are integers, and since

$$a_n = -\alpha\,b_{n-1} = \alpha(-b_{n-1}),$$

our assertion is proved.*

* It would be wrong to attempt to prove this theorem by referring to the fact that the constant term is (to within sign) a product of the roots of the polynomial $f(x)$: these roots can include fractional, irrational, and complex roots, and one cannot, therefore, assert beforehand that the product of all these roots (except $\alpha$) will be integral.

Thus, if an integral polynomial $f(x)$ has integral roots, they will be found among the divisors of the constant term. It is thus necessary to test all possible divisors (both positive and negative) of the constant term. If none is a root of the polynomial, then the polynomial has no integral roots at all.

To test all the divisors of the constant term may turn out to be extremely complicated even if the values of the polynomial have been computed by the Horner method and not via direct substitution of each of the divisors in place of the unknown. The following remarks somewhat simplify computations. First of all, since both 1 and $-1$ are always divisors of the constant term, we compute $f(1)$ and $f(-1)$. This presents no difficulties. Now if the integer $\alpha$ is a root of $f(x)$,

$$f(x) = (x - \alpha)\,q(x),$$

then, as indicated above, all the coefficients of the quotient $q(x)$ will be integers, and therefore the quotients

$$\frac{f(1)}{\alpha - 1} = -q(1), \quad \frac{f(-1)}{\alpha + 1} = -q(-1)$$

must be integers. Thus, only such divisors $\alpha$ of the constant term (from among those which differ from 1 and $-1$) have to be tested, relative to which each of the quotients $\frac{f(1)}{\alpha - 1}$, $\frac{f(-1)}{\alpha + 1}$ is an integer.

Example 1. Find the integral roots of the polynomial

$$f(x) = x^3 - 2x^2 - x - 6.$$

The numbers $\pm 1$, $\pm 2$, $\pm 3$, $\pm 6$ are divisors of the constant term. Since $f(1) = -8$, $f(-1) = -8$, it follows that 1 and $-1$ are not roots. Furthermore, the numbers

$$\frac{-8}{2 + 1}, \quad \frac{-8}{-2 - 1}, \quad \frac{-8}{6 - 1}, \quad \frac{-8}{-6 - 1}$$

are fractions and so the divisors 2, $-2$, 6, $-6$ have to be rejected, whereas the numbers

$$\frac{-8}{3 - 1}, \quad \frac{-8}{3 + 1}, \quad \frac{-8}{-3 - 1}, \quad \frac{-8}{-3 + 1}$$

are integers and so the divisors 3 and $-3$ have yet to be tested. We apply the Horner method:

       |  1   -2   -1    -6
    -3 |  1   -5   14   -48

That is, $f(-3) = -48$ and so $-3$ is not a root of $f(x)$. Finally,

       |  1   -2   -1    -6
     3 |  1    1    2     0

That is, $f(3) = 0$: the number 3 is a root of $f(x)$. At the same time we found the coefficients of the quotient obtained by dividing $f(x)$ by $x - 3$:

$$f(x) = (x - 3)(x^2 + x + 2).$$

It is readily seen that the quotient $x^2 + x + 2$ does not have 3 as its root, which means that this number is not a multiple root of $f(x)$.

Example 2. Find integral roots of the polynomial

$$f(x) = 3x^4 + x^3 - 5x^2 - 2x + 2.$$

Here, $\pm 1$ and $\pm 2$ are divisors of the constant term. Furthermore, $f(1) = -1$, $f(-1) = 1$, i.e., 1 and $-1$ do not serve as roots. Finally, since the numbers

$$\frac{1}{2 + 1}, \quad \frac{-1}{-2 - 1}$$

are fractions, it follows that 2 and $-2$ will not be roots either and so the polynomial $f(x)$ does not have any integral roots at all.
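The procedure of the two examples above (list the divisors of the constant term, discard those failing the two divisibility tests, evaluate the survivors by the Horner method) can be sketched as follows. The function names are illustrative, and a nonzero constant term is assumed.

```python
def horner(coeffs, x):
    """Value of the polynomial (coefficients leading-first) at x."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def divisors(n):
    """All positive and negative divisors of a nonzero integer n."""
    n = abs(n)
    positive = [d for d in range(1, n + 1) if n % d == 0]
    return positive + [-d for d in positive]

def integer_roots(coeffs):
    """Integral roots of an integral polynomial with nonzero constant term."""
    f1, fm1 = horner(coeffs, 1), horner(coeffs, -1)
    roots = []
    for a in divisors(coeffs[-1]):
        if a not in (1, -1):
            # Reject a unless f(1)/(a - 1) and f(-1)/(a + 1) are both integers.
            if f1 % (a - 1) != 0 or fm1 % (a + 1) != 0:
                continue
        if horner(coeffs, a) == 0:
            roots.append(a)
    return sorted(roots)

example_1 = integer_roots([1, -2, -1, -6])       # x^3 - 2x^2 - x - 6
example_2 = integer_roots([3, 1, -5, -2, 2])     # 3x^4 + x^3 - 5x^2 - 2x + 2
```

The divisibility filter only saves evaluations; removing it would not change the result, since every surviving candidate is still verified by direct evaluation.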


Let us now examine the question of fractional roots.

If an integral polynomial whose leading coefficient is unity has a rational root, then this root is an integer.

Indeed, let the polynomial

$$f(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_n$$

with integral coefficients have for a root the fraction $\frac{b}{c}$ in lowest terms, i.e.,

$$\frac{b^n}{c^n} + a_1 \frac{b^{n-1}}{c^{n-1}} + a_2 \frac{b^{n-2}}{c^{n-2}} + \dots + a_n = 0.$$

From this it follows that

$$\frac{b^n}{c} = -a_1 b^{n-1} - a_2 b^{n-2} c - \dots - a_n c^{n-1}.$$

Thus a fraction in lowest terms is equal to an integer, which is impossible.

To obtain all the rational (fractional and integral) roots of an integral polynomial

$$f(x) = a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_n$$

it is necessary to find all the integral roots of the polynomial

$$\varphi(y) = y^n + a_1 y^{n-1} + a_0 a_2 y^{n-2} + \dots + a_0^{n-2} a_{n-1} y + a_0^{n-1} a_n$$

and divide them by $a_0$.

Multiply $f(x)$ by $a_0^{n-1}$ and then change the unknown, putting $y = a_0 x$. Clearly,

$$\varphi(y) = \varphi(a_0 x) = a_0^{n-1} f(x),$$

whence it follows that the roots of the polynomial $f(x)$ are equal to the roots of the polynomial $\varphi(y)$ divided by $a_0$. In particular, to rational roots of $f(x)$ there will correspond rational roots of $\varphi(y)$; however, since the leading coefficient of $\varphi(y)$ is equal to unity, these roots can only be integral, and we already have a procedure for finding them.

Example. Find rational roots of the polynomial

$$f(x) = 3x^4 + 5x^3 + x^2 + 5x - 2.$$

Multiplying $f(x)$ by $3^3$ and setting $y = 3x$, we get

$$\varphi(y) = y^4 + 5y^3 + 3y^2 + 45y - 54.$$

We seek integral roots of the polynomial $\varphi(y)$. Let us find $\varphi(1)$ by the Horner method:

       |  1    5    3   45   -54
     1 |  1    6    9   54     0

Thus, $\varphi(1) = 0$, that is, 1 is a root of $\varphi(y)$, and

$$\varphi(y) = (y - 1)\,q(y),$$

where

$$q(y) = y^3 + 6y^2 + 9y + 54.$$

Let us find the integral roots of the polynomial $q(y)$. The numbers $\pm 1$, $\pm 2$, $\pm 3$, $\pm 6$, $\pm 9$, $\pm 18$, $\pm 27$, $\pm 54$ are divisors of the constant term. Here,

$$q(1) = 70, \quad q(-1) = 50.$$

Computing $\frac{q(1)}{\alpha - 1}$ and $\frac{q(-1)}{\alpha + 1}$ for every divisor $\alpha$, we find that all divisors, except $\alpha = -6$, must be rejected. Test this divisor:

       |  1    6    9   54
    -6 |  1    0    9    0

Thus, $q(-6) = 0$, or $-6$ is a root of $q(y)$ and therefore also of $\varphi(y)$.

Consequently, the polynomial $\varphi(y)$ has integral roots 1 and $-6$. Thus the numbers $\frac{1}{3}$ and $-2$, and only these numbers, are rational roots of the polynomial $f(x)$.


It must be stressed once again that the above-described methods 
are applicable only to polynomials with integral coefficients and 
only for finding their rational roots. 
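The reduction to a polynomial with leading coefficient unity translates directly into a short program. The sketch below builds $\varphi(y)$ from the coefficients of $f(x)$, finds its integral roots among the divisors of its constant term, and divides them by $a_0$. The names `monic_transform` and `rational_roots` are illustrative, and a nonzero constant term is assumed.

```python
from fractions import Fraction

def horner(coeffs, x):
    """Value of the polynomial (coefficients leading-first) at x."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def monic_transform(coeffs):
    """Coefficients of phi(y) = a0^(n-1) f(y / a0): a0^(k-1) a_k at y^(n-k)."""
    a0 = coeffs[0]
    return [1] + [a0 ** (k - 1) * coeffs[k] for k in range(1, len(coeffs))]

def rational_roots(coeffs):
    """Rational roots of an integral polynomial with nonzero constant term."""
    a0 = coeffs[0]
    phi = monic_transform(coeffs)
    n = abs(phi[-1])
    candidates = [s * d for d in range(1, n + 1) if n % d == 0 for s in (1, -1)]
    return sorted(Fraction(y, a0) for y in candidates if horner(phi, y) == 0)

f = [3, 5, 1, 5, -2]                  # 3x^4 + 5x^3 + x^2 + 5x - 2
phi = monic_transform(f)
roots = rational_roots(f)
```

For brevity every divisor is evaluated directly here; the divisibility tests with $\varphi(1)$ and $\varphi(-1)$ described earlier could be added to reduce the number of evaluations.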


58. Algebraic Numbers 

Every polynomial of degree $n$ with rational coefficients has $n$ roots in the field of complex numbers; some of these roots (or even all of them) can lie outside the field of rational numbers. However, not every complex or real number serves as a root of some polynomial with rational coefficients. The complex (or, in particular, real) numbers which are roots of such polynomials are called algebraic numbers in contrast to transcendental numbers. Algebraic numbers include all rational numbers (as the roots of first-degree polynomials with rational coefficients) and also any radical of the form $\sqrt[n]{a}$ with rational radicand $a$ (as a root of the binomial $x^n - a$). On the other hand, the more comprehensive courses of mathematical analysis offer proof of the transcendence of the number $e$ (the base of the system of natural logarithms) and also of the familiar number $\pi$ of elementary geometry.

If a number $\alpha$ is algebraic, then it will even be a root of some polynomial with integral coefficients and therefore a root of one of the irreducible divisors of this polynomial, also with integral coefficients. The irreducible integral polynomial, of which $\alpha$ is a root, is determined uniquely to within a constant factor, that is to say, quite uniquely if we require that the coefficients of the polynomial be relatively prime jointly (i.e., that the polynomial be primitive). Indeed, if $\alpha$ serves as a root of two irreducible polynomials $f(x)$ and $g(x)$, then the greatest common divisor of these polynomials will be different from unity, and therefore the polynomials, due to their irreducibility, can differ from one another by a zero-degree factor only.

Algebraic numbers which are roots of one and the same irreducible (over the field $R$) polynomial are termed conjugate.* Thus, the whole set of algebraic numbers breaks up into disjoint finite classes of conjugate numbers. No rational number, as a root of a first-degree polynomial, has conjugate numbers different from itself; this property is characteristic of rational numbers: every algebraic number which is not rational is a root of an irreducible polynomial of degree greater than unity, and for this reason it has conjugate numbers different from itself.

* Not to be confused with the concept of the conjugacy of complex numbers.

The set of all algebraic numbers is a subfield of the field of complex numbers. In other words, the sum, difference, product and quotient of algebraic numbers are algebraic numbers.

In fact, suppose we have the algebraic numbers $\alpha$ and $\beta$. Denote by $\alpha_1 = \alpha, \alpha_2, \ldots, \alpha_n$ all numbers conjugate to $\alpha$, by $\beta_1 = \beta, \beta_2, \ldots, \beta_s$ all numbers conjugate to $\beta$, and by $f(x)$ and $g(x)$ the irreducible polynomials with rational coefficients having for roots $\alpha$ and $\beta$ respectively. Write a polynomial whose roots are all possible sums $\alpha_i + \beta_j$; this is

$$\varphi(x) = \prod_{i=1}^{n} \prod_{j=1}^{s} [x - (\alpha_i + \beta_j)]$$

It is obvious that the coefficients of this polynomial will not change under rearrangements of all $\alpha_i$ and also of all $\beta_j$. Hence, on the basis of the theorem on polynomials symmetric with respect to two systems of unknowns (see end of Sec. 53), they are polynomials in the coefficients of the polynomials $f(x)$ and $g(x)$. In other words, the coefficients of the polynomial $\varphi(x)$ prove to be rational numbers, and therefore the number $\alpha + \beta = \alpha_1 + \beta_1$, which is one of its roots, will be algebraic.
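The construction can be tried numerically. The sketch below is only an illustration under assumed data: it takes $\alpha = \sqrt{2}$ and $\beta = \sqrt{3}$ (chosen here, not taken from the text), lists their conjugates by hand, multiplies out the product over all sums $\alpha_i + \beta_j$ in floating point, and rounds the coefficients. The helper name `poly_from_roots` is hypothetical.

```python
from itertools import product
from math import sqrt

def poly_from_roots(roots):
    """Coefficients (highest degree first) of the product of (x - r) over the roots."""
    coeffs = [1.0]
    for r in roots:
        shifted = coeffs + [0.0]                     # current polynomial times x
        scaled = [0.0] + [-r * c for c in coeffs]    # current polynomial times (-r)
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

alphas = [sqrt(2), -sqrt(2)]   # conjugates of sqrt(2): roots of x^2 - 2
betas = [sqrt(3), -sqrt(3)]    # conjugates of sqrt(3): roots of x^2 - 3

sums = [a + b for a, b in product(alphas, betas)]
phi = [round(c) for c in poly_from_roots(sums)]   # rounding removes floating-point noise
print(phi)   # coefficients of x^4 - 10x^2 + 1
```

The rounded coefficients are integers, in agreement with the claim that the coefficients of the product polynomial are rational.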

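The algebraic nature of the numbers $\alpha - \beta$ and $\alpha\beta$ is proved in similar fashion with the aid of the polynomials

$$\psi(x) = \prod_{i=1}^{n} \prod_{j=1}^{s} [x - (\alpha_i - \beta_j)]$$

and

$$\chi(x) = \prod_{i=1}^{n} \prod_{j=1}^{s} (x - \alpha_i \beta_j)$$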


To prove the algebraic nature of a quotient, it suffices to demonstrate that if a number $\alpha$ is algebraic and different from zero, then $\alpha^{-1}$ will also be an algebraic number. Let $\alpha$ be a root of the polynomial

$$f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_{n-1} x + a_n$$

with rational coefficients. Then, evidently, the polynomial

$$g(x) = a_n x^n + a_{n-1} x^{n-1} + \ldots + a_1 x + a_0$$

which also has rational coefficients, will have for a root the number $\alpha^{-1}$, which is what we set out to prove.
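The word "evidently" can be unpacked in one line. The following short derivation uses only the two polynomials $f$ and $g$ of the preceding argument and the assumption $\alpha \neq 0$:

```latex
g(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0
     = x^n f\!\left(\frac{1}{x}\right) \quad (x \neq 0),
\qquad\text{hence}\qquad
g(\alpha^{-1}) = \alpha^{-n} f(\alpha) = 0 .
```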

It follows from the theorem just proved that any sum of a rational number and a radical, and also any sum of radicals, will be an algebraic number. However, we cannot as yet assert that numbers written as radicals within radicals, that is, radicals whose radicand itself contains a radical, are algebraic. This will be a consequence of the following theorem.

If the number $\omega$ is a root of the polynomial

$$\varphi(x) = x^n + \alpha x^{n-1} + \beta x^{n-2} + \ldots + \lambda x + \mu$$

whose coefficients are algebraic numbers, then $\omega$ is also an algebraic number.

Let $\alpha_i, \beta_j, \ldots, \lambda_s, \mu_t$ run through numbers which are respectively conjugate to $\alpha, \beta, \ldots, \lambda, \mu$, it being true that $\alpha_1 = \alpha$, $\beta_1 = \beta$, ..., $\lambda_1 = \lambda$, $\mu_1 = \mu$. Consider all possible polynomials of the form

$$\varphi_{i, j, \ldots, s, t}(x) = x^n + \alpha_i x^{n-1} + \beta_j x^{n-2} + \ldots + \lambda_s x + \mu_t$$

so that $\varphi_{1, 1, \ldots, 1, 1}(x) = \varphi(x)$, and take the product of all these polynomials:

$$F(x) = \prod_{i, j, \ldots, s, t} \varphi_{i, j, \ldots, s, t}(x)$$

The coefficients of the polynomial $F(x)$ are obviously symmetric with respect to each of the systems $\alpha_i$, $\beta_j$, ..., $\lambda_s$, $\mu_t$ and therefore (again by the theorem of Sec. 53) are polynomials in the coefficients of those irreducible polynomials (with rational coefficients) whose roots are, respectively, $\alpha, \beta, \ldots, \lambda, \mu$; that is to say, they are themselves rational numbers. The number $\omega$, being a root of $\varphi(x)$, will, consequently, be a root of the polynomial $F(x)$ with rational coefficients, i.e., it will be an algebraic number.

Let us apply this theorem to a number $\omega$ that is a radical of the number $\alpha = 1 + \sqrt{2}$, say $\omega = \sqrt[n]{1 + \sqrt{2}}$. The number $\alpha$ is algebraic by the previous theorem, and therefore the number $\omega$ is a root of the polynomial $x^n - \alpha$ with algebraic coefficients; that is, it is itself algebraic. Generally, applying several times both theorems that have just been proved, the reader will easily arrive at the following result.

Any number written in radicals over the field of rational numbers (that is to say, a number expressed in terms of some arbitrarily complicated combination of radicals, radicals within radicals in the general case) is an algebraic number.
Obviously, algebraic numbers written as radicals constitute a field. One must bear in mind, however, that this field, as follows from the remark made (without proof) at the end of Sec. 38, will only be a part of the field of all algebraic numbers.

We have already mentioned the transcendence of two numbers: $e$ and $\pi$. Actually, however, there are an infinity of transcendental numbers. What is more, using the concepts and methods of set theory, we will show that there are, so to say, even more transcendental numbers than algebraic numbers. The exact meaning of this sentence will be clear from what follows.

An infinite set $M$ is called countable (denumerable) if it can be put into one-to-one correspondence with the set of natural numbers, that is to say, if its elements can be enumerated with the aid of the natural numbers; otherwise it is noncountable.

Lemma 1. Every infinite set M contains a countable subset. 

Indeed, take an arbitrary element $a_1$ in $M$ and then an element $a_2$ different from $a_1$. Generally, let there be chosen $n$ distinct elements $a_1, a_2, \ldots, a_n$ in $M$. Since the set $M$ is infinite, it cannot be exhausted by these elements, and so we can find an element $a_{n+1}$ different from them. Continuing this process, we will find in $M$ an infinite subset composed of the elements

$$a_1, a_2, \ldots, a_n, \ldots$$

The countability of this subset is obvious.

Lemma 2. Every infinite subset $B$ of a countable set $A$ is itself countable.

Because of its countability, the set $A$ can be written as

$$a_1, a_2, \ldots, a_n, \ldots \qquad (1)$$

Let $a_{k_1}$ be the first element of the sequence (1) belonging to $B$, $a_{k_2}$ the second element with this same property, etc. Assuming $a_{k_n} = b_n$, $n = 1, 2, \ldots$, we find that the elements of the subset $B$ constitute a sequence

$$b_1, b_2, \ldots, b_n, \ldots$$

It is clear that this subset is countable.

Lemma 3. The union of a countable set of finite sets which pairwise 
do not have any common elements is a countable set. 

Indeed, suppose we have the finite sets

$$A_1, A_2, \ldots, A_n, \ldots$$

Let their union be $B$. We will obviously enumerate all elements of the set $B$ if, in arbitrary fashion, we number the elements of the finite set $A_1$, then continue the numbering by passing to the elements of the set $A_2$, and so on.
Lemma 4. The union of two countable sets which are devoid of common elements is a countable set.

Let there be given a countable set $A$ with elements

$$a_1, a_2, \ldots, a_n, \ldots$$

and a countable set $B$ with elements

$$b_1, b_2, \ldots, b_n, \ldots$$

and let the union of these sets be $C$. If we put

$$c_{2n-1} = a_n, \quad c_{2n} = b_n, \quad n = 1, 2, \ldots$$

then all elements of $C$ will be represented as the sequence

$$c_1, c_2, \ldots, c_n, \ldots$$

This completes the proof of the countability of this set.
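The enumeration used in Lemma 4 simply alternates between the two sequences. A minimal sketch, assuming two disjoint sample sets chosen only for illustration (positive even numbers and negative integers); the generator name `interleave` is hypothetical:

```python
from itertools import count, islice

def interleave(a, b):
    """Yield a_1, b_1, a_2, b_2, ... from two infinite iterators."""
    for x, y in zip(a, b):
        yield x   # takes an odd position in the combined sequence
        yield y   # takes the following even position

evens = (2 * n for n in count(1))     # stands in for a_1, a_2, ...
negatives = (-n for n in count(1))    # stands in for b_1, b_2, ...

first = list(islice(interleave(evens, negatives), 6))
print(first)
```

Every element of either set receives a finite index, which is all that countability of the union requires.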

Now let us prove the following theorem. 

The set of all algebraic numbers is countable. 

First let us prove the countability of the set of all polynomials in one unknown with integral coefficients. If

$$f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_{n-1} x + a_n$$

is such a polynomial (different from zero), let us use the term height of the polynomial for the natural number

$$h_f = n + |a_0| + |a_1| + \ldots + |a_{n-1}| + |a_n|$$

It is obvious that there is only a finite number of integral polynomials with a given height $h$; denote this set by $M_h$. Denote the set consisting of zero alone by $M_0$. The set of all integral polynomials will be the union of the countable set of the finite sets $M_0, M_1, M_2, \ldots, M_h, \ldots$; that is to say, by Lemma 3, it is countable.

From this, by Lemma 2, it follows that the set of all integral primitive irreducible polynomials is also countable. At the same time, we know that every algebraic number is a root of one and only one integral primitive irreducible polynomial. Consequently, collecting the roots of all such polynomials, that is, taking the union of the countable set of finite sets, we obtain the set of all algebraic numbers. This set will thus, by Lemma 3, be countable.
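The finiteness of the set of polynomials of a given height can be checked by brute force. The sketch below is an illustration, not part of the proof: a polynomial is represented as a tuple $(a_0, \ldots, a_n)$ with $a_0 \neq 0$, and the function name `polys_of_height` is hypothetical.

```python
from itertools import product

def polys_of_height(h):
    """All nonzero integer polynomials (a_0, ..., a_n), a_0 != 0, with n + sum|a_i| == h."""
    found = []
    for n in range(h):                 # degree n leaves h - n >= 1 for the coefficients
        s = h - n
        for coeffs in product(range(-s, s + 1), repeat=n + 1):
            if coeffs[0] != 0 and sum(abs(c) for c in coeffs) == s:
                found.append(coeffs)
    return found

sizes = [len(polys_of_height(h)) for h in (1, 2, 3)]
print(sizes)
```

For example, $x^2 - 2$ has height $2 + 1 + 0 + 2 = 5$, so the tuple `(1, 0, -2)` appears in `polys_of_height(5)`.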

Finally, let us prove the following theorem.

The set of all transcendental numbers is noncountable.

Let us first consider the set $F$ of all real numbers $x$ between zero and unity, $0 < x < 1$, and let us prove that this set is noncountable. We know that each of the indicated numbers $x$ may be written as a regular infinite decimal fraction

$$x = 0.\alpha_1 \alpha_2 \ldots \alpha_n \ldots$$




and that this notation is unique if we do not allow for fractions in which for all $n$ beyond some $n = n_0$ all $\alpha_n = 9$; conversely, any fraction of this form is equal to some number $x$ from the set $F$. Now suppose that the set $F$ is countable, that is, that all the numbers $x$ can be written as the sequence

$$x_1, x_2, \ldots, x_k, \ldots \qquad (2)$$

Let

$$x_k = 0.\alpha_{k1} \alpha_{k2} \ldots \alpha_{kn} \ldots$$

be the notation of the number $x_k$ in the form of an infinite decimal. Now write the infinite decimal fraction

$$0.\beta_1 \beta_2 \ldots \beta_n \ldots \qquad (3)$$

assuming $\beta_1$ to be different from the first decimal digit of the fraction $x_1$, that is, $\beta_1 \neq \alpha_{11}$, $\beta_2$ to be different from the second decimal digit of the fraction $x_2$, i.e., $\beta_2 \neq \alpha_{22}$ and, generally, $\beta_n \neq \alpha_{nn}$. Besides, assume that among the digits $\beta_n$ there are infinitely many that are different from the digit 9. It is clear that there is a fraction (3) which satisfies all these requirements. It is thus a number in the set $F$, but it is different, by its construction, from all the numbers of the sequence (2). This contradiction proves the noncountability of the set $F$.
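The diagonal construction is easy to imitate on a finite prefix of a supposed enumeration. The sketch below is only an illustration: the listed digit strings are arbitrary sample data, and the rule "use 1 unless the diagonal digit is 1, then use 2" is one assumed way of choosing each new digit so that it differs from the diagonal digit and is never 0 or 9.

```python
def diagonal_digits(rows):
    """rows[k] holds the decimal digits of the (k+1)-th listed number.

    Returns a digit string whose k-th digit differs from the k-th digit of rows[k].
    """
    new_digits = []
    for k, row in enumerate(rows):
        diagonal = int(row[k])
        new_digits.append('1' if diagonal != 1 else '2')   # never 0 or 9
    return ''.join(new_digits)

listed = ["1415", "7182", "4142", "5772"]   # sample digits after the decimal point
beta = diagonal_digits(listed)
print("0." + beta)
```

The resulting fraction differs from the k-th listed number in its k-th digit, so it cannot coincide with any of them.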

Whence follows the noncountability of the set of all complex numbers: if the set were countable, then, by Lemma 2, it could not contain the noncountable subset $F$. The noncountability of the set of all transcendental numbers is now, by Lemma 4, obvious, since the union of this set with the countable set of all algebraic numbers is the set of all complex numbers, that is to say, it is noncountable.

The two theorems we have proved show us, due to Lemma 1, that the set of the transcendental numbers is indeed much richer in elements (that is to say, more "potent") than the set of algebraic numbers.
NORMAL FORM
OF A MATRIX


59. Equivalence of λ-Matrices

We return now to problems of linear algebra. Chapter 7 demonstrated the important role of the concept of similarity of matrices. Namely, two square matrices of order $n$ are similar if and only if they represent (in different bases) the same linear transformation of $n$-dimensional linear space. However, we are not yet able to tell whether two given specific matrices are similar or not. On the other hand, among all matrices similar to a given matrix $A$, we are not able to indicate a matrix of elementary form (in one sense or another); even the question of the conditions under which a matrix $A$ is similar to a diagonal matrix was considered in Sec. 33 only for a particular case. These are the questions we will take up in this chapter. (Note that they are discussed straight off for the case of an arbitrary base field $P$.)

Let us first investigate square matrices of order $n$ whose elements are polynomials of arbitrary degree in a single unknown $\lambda$ with coefficients from the field $P$. These are called polynomial matrices or, briefly, λ-matrices. An example of a λ-matrix is the characteristic matrix $A - \lambda E$ of an arbitrary square matrix $A$ with elements in $P$. The principal diagonal of this matrix contains first-degree polynomials, all off-diagonal elements are zero-degree polynomials or zeros. Every matrix with elements from the field $P$ (for brevity, we call them numerical matrices) is also a special case of a λ-matrix: its elements are polynomials of degree zero or zeros.

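Suppose we have a λ-matrix

$$A(\lambda) = \begin{pmatrix} a_{11}(\lambda) & \ldots & a_{1n}(\lambda) \\ \vdots & & \vdots \\ a_{n1}(\lambda) & \ldots & a_{nn}(\lambda) \end{pmatrix}$$

We use the term elementary transformations of this matrix for the following four types of transformation: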
(1) multiplication of any row of the matrix $A(\lambda)$ by any scalar $\alpha$ in $P$ different from zero;

(2) multiplication of any column of $A(\lambda)$ by any scalar $\alpha$ in $P$ different from zero;

(3) addition, to any $i$th row of matrix $A(\lambda)$, of any $j$th row of it, $j \neq i$, multiplied by any polynomial $\varphi(\lambda)$ in the ring $P[\lambda]$;

(4) addition, to any $i$th column of matrix $A(\lambda)$, of any $j$th column of it, $j \neq i$, multiplied by any polynomial $\varphi(\lambda)$ in the ring $P[\lambda]$.

It is readily seen that for every elementary transformation of the λ-matrix there is an inverse transformation which is also elementary. Thus, the inverse of (1) is an elementary transformation consisting in the multiplication of that row by the number $\alpha^{-1}$, which exists due to the condition $\alpha \neq 0$; the inverse of (3) is a transformation which consists in adding to the $i$th row the $j$th row multiplied by $-\varphi(\lambda)$.

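It is possible to interchange any two rows or any two columns in a matrix $A(\lambda)$ by a number of elementary transformations.

Suppose we wish to interchange the $i$th and $j$th rows of $A(\lambda)$. This can be accomplished by means of four elementary transformations as the scheme below illustrates (here $i$ and $j$ stand for the original $i$th and $j$th rows):

$$\begin{pmatrix} i \\ j \end{pmatrix} \to \begin{pmatrix} i + j \\ j \end{pmatrix} \to \begin{pmatrix} i + j \\ -i \end{pmatrix} \to \begin{pmatrix} j \\ -i \end{pmatrix} \to \begin{pmatrix} j \\ i \end{pmatrix}$$

The sequence of transformations is: (a) add the $j$th row to the $i$th row; (b) subtract the new $i$th row from the $j$th row; (c) add the new $j$th row to the new $i$th row; (d) multiply the new $j$th row by $-1$.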
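The four steps (a) to (d) can be replayed on a small numerical matrix. This is a sketch for illustration only: the entries are plain integers (zero-degree polynomials), and the function name `swap_rows_elementary` is hypothetical.

```python
def swap_rows_elementary(m, i, j):
    """Interchange rows i and j using three row additions and one scaling by -1."""
    m = [row[:] for row in m]                      # work on a copy
    m[i] = [a + b for a, b in zip(m[i], m[j])]     # (a) add row j to row i
    m[j] = [b - a for a, b in zip(m[i], m[j])]     # (b) subtract the new row i from row j
    m[i] = [a + b for a, b in zip(m[i], m[j])]     # (c) add the new row j to the new row i
    m[j] = [-b for b in m[j]]                      # (d) multiply the new row j by -1
    return m

swapped = swap_rows_elementary([[1, 2], [3, 4]], 0, 1)
print(swapped)
```

Steps (a) to (c) are transformations of type (3) with multiplier $\pm 1$, and step (d) is a transformation of type (1).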

We will say that the λ-matrices $A(\lambda)$ and $B(\lambda)$ are equivalent and we will write $A(\lambda) \sim B(\lambda)$ if the matrix $A(\lambda)$ can be carried into the matrix $B(\lambda)$ by means of a finite number of elementary transformations. This equivalence relation is obviously reflexive and transitive and also symmetric, due to the existence of an inverse elementary transformation for every elementary transformation. In other words, all square λ-matrices of order $n$ over the field $P$ break up into disjoint classes of equivalent matrices.

Our immediate aim will be to find the simplest kind of matrices among all the λ-matrices equivalent to the given matrix $A(\lambda)$. To do this, we introduce the following concept. A canonical λ-matrix is a λ-matrix with the following three properties:

(a) the matrix is diagonal, that is, of the form

$$\begin{pmatrix} e_1(\lambda) & & & 0 \\ & e_2(\lambda) & & \\ & & \ddots & \\ 0 & & & e_n(\lambda) \end{pmatrix} \qquad (1)$$

(b) any polynomial $e_i(\lambda)$, $i = 2, 3, \ldots, n$, is exactly divisible by the polynomial $e_{i-1}(\lambda)$;

(c) the leading coefficient of every polynomial $e_i(\lambda)$, $i = 1, 2, \ldots, n$, is equal to unity if the polynomial is nonzero.

Note that if among the polynomials $e_i(\lambda)$ on the principal diagonal of the canonical λ-matrix (1) there are some equal to zero, then, by Property (b), they invariably occupy the last positions on the principal diagonal. On the other hand, if there are zero-degree polynomials among the polynomials $e_i(\lambda)$, then, by Property (c), they are all equal to unity, and, by Property (b), they occupy the first positions on the principal diagonal of the matrix (1).

The canonical λ-matrices embrace, among others, certain numerical matrices, including the unit and zero matrices.

Any λ-matrix is equivalent to some canonical λ-matrix, that is to say, it can be reduced to canonical form via elementary transformations.

We will prove this theorem by induction with respect to the order $n$ of the λ-matrices at hand. Indeed, for $n = 1$ we have

$$A(\lambda) = (a(\lambda))$$

If $a(\lambda) = 0$, then our matrix is already canonical. But if $a(\lambda) \neq 0$, then it suffices to divide the polynomial $a(\lambda)$ by its leading coefficient (this is an elementary matrix transformation) in order to get a canonical matrix.

Suppose the theorem has been proved for λ-matrices of order $n - 1$. We consider an arbitrary λ-matrix $A(\lambda)$ of order $n$. If it is a zero matrix, it is already canonical and no proof is needed. We therefore take it that there are nonzero elements among the elements of matrix $A(\lambda)$.

Interchanging (if necessary) rows and columns of $A(\lambda)$, we can move one of the nonzero elements into the upper left-hand corner. Thus, of the λ-matrices equivalent to $A(\lambda)$, there are some with a nonzero polynomial in the upper left corner. Let us consider all such matrices. The polynomials in the upper left corner of these matrices may have different degrees. But the degree of a polynomial is a nonnegative integer, and in any nonempty set of nonnegative integers there is a least number. It is thus possible to find, from among all the λ-matrices equivalent to $A(\lambda)$ and having a nonzero element in the upper left corner, one matrix such that the polynomial in its upper left corner is of the lowest possible degree. Finally, dividing the first row of this matrix by the leading coefficient of the indicated polynomial, we get a λ-matrix equivalent to $A(\lambda)$,

$$A(\lambda) \sim \begin{pmatrix} e_1(\lambda) & b_{12}(\lambda) & \ldots & b_{1n}(\lambda) \\ b_{21}(\lambda) & b_{22}(\lambda) & \ldots & b_{2n}(\lambda) \\ \vdots & \vdots & & \vdots \\ b_{n1}(\lambda) & b_{n2}(\lambda) & \ldots & b_{nn}(\lambda) \end{pmatrix}$$

such that $e_1(\lambda) \neq 0$, the leading coefficient of this polynomial is equal to unity, and no combination of elementary transformations can carry the resulting matrix into a matrix in which the upper left-hand corner would be occupied by a nonzero polynomial of lower degree.

We now prove that all elements of the first row and first column of the matrix obtained are exactly divisible by $e_1(\lambda)$. Suppose, for example, for $2 \leq j \leq n$,

$$b_{1j}(\lambda) = e_1(\lambda) q(\lambda) + r(\lambda)$$

where the degree of $r(\lambda)$ is less than the degree of $e_1(\lambda)$ if $r(\lambda)$ is different from zero. Then, subtracting from the $j$th column of our matrix the first column multiplied by $q(\lambda)$ and interchanging the first and $j$th columns, we obtain a matrix equivalent to $A(\lambda)$ in the upper left corner of which is the polynomial $r(\lambda)$, that is to say, a polynomial of lower degree than $e_1(\lambda)$, which contradicts the choice of this polynomial, whence it follows that $r(\lambda) = 0$. The proof is complete.

Now subtracting from the $j$th column of our matrix the first column multiplied by $q(\lambda)$, we replace the element $b_{1j}(\lambda)$ by zero. Performing such transformations for $j = 2, 3, \ldots, n$, we substitute zeros for all elements $b_{1j}(\lambda)$. In similar fashion we substitute zeros for all elements $b_{i1}(\lambda)$, $i = 2, 3, \ldots, n$. We thus arrive at a matrix, equivalent to $A(\lambda)$, in the upper left corner of which is the polynomial $e_1(\lambda)$, all other elements of the first row and the first column being zero:

$$A(\lambda) \sim \begin{pmatrix} e_1(\lambda) & 0 & \ldots & 0 \\ 0 & c_{22}(\lambda) & \ldots & c_{2n}(\lambda) \\ \vdots & \vdots & & \vdots \\ 0 & c_{n2}(\lambda) & \ldots & c_{nn}(\lambda) \end{pmatrix} \qquad (2)$$

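By the induction hypothesis, the matrix of order $n - 1$ in the lower right corner of the matrix (2) that we have obtained can be reduced to canonical form by elementary transformations:

$$\begin{pmatrix} c_{22}(\lambda) & \ldots & c_{2n}(\lambda) \\ \vdots & & \vdots \\ c_{n2}(\lambda) & \ldots & c_{nn}(\lambda) \end{pmatrix} \sim \begin{pmatrix} e_2(\lambda) & & 0 \\ & \ddots & \\ 0 & & e_n(\lambda) \end{pmatrix}$$

Having performed these same transformations on the corresponding rows and columns of matrix (2) (in the process, the first row and first column will obviously remain unchanged), we find that

$$A(\lambda) \sim \begin{pmatrix} e_1(\lambda) & & & 0 \\ & e_2(\lambda) & & \\ & & \ddots & \\ 0 & & & e_n(\lambda) \end{pmatrix} \qquad (3)$$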
To prove that the matrix (3) is canonical, it remains to demonstrate that $e_2(\lambda)$ is exactly divisible by $e_1(\lambda)$. Suppose

$$e_2(\lambda) = e_1(\lambda) q(\lambda) + r(\lambda)$$

where $r(\lambda) \neq 0$ and the degree of $r(\lambda)$ is less than that of $e_1(\lambda)$. However, by adding to the second column of (3) the first column multiplied by $q(\lambda)$ and then subtracting the first row from the second, we replace the element $e_2(\lambda)$ by the element $r(\lambda)$. Then, by interchanging the first two rows and the first two columns, we transfer the polynomial $r(\lambda)$ to the upper left corner of the matrix, but this contradicts the choice of the polynomial $e_1(\lambda)$.

The theorem on the reduction of a λ-matrix to canonical form is proved. We have to supplement it with the following uniqueness theorem.

Every λ-matrix is equivalent to one canonical matrix only.

Suppose we have an arbitrary λ-matrix $A(\lambda)$ of order $n$. Take some natural number $k$, $1 \leq k \leq n$, and consider all $k$th-order minors of $A(\lambda)$. Computing these minors, we obtain a finite system of polynomials in $\lambda$; we denote the greatest common divisor of this system of polynomials with leading coefficient 1 by $d_k(\lambda)$.

We thus have the polynomials

$$d_1(\lambda), d_2(\lambda), \ldots, d_n(\lambda) \qquad (4)$$

which are uniquely defined by the matrix $A(\lambda)$ itself. Here, $d_1(\lambda)$ is the greatest common divisor of all elements of $A(\lambda)$ with leading coefficient 1, and $d_n(\lambda)$ is equal to the determinant of the matrix $A(\lambda)$ divided by its leading coefficient. Also note that if the matrix $A(\lambda)$ has rank $r$, then

$$d_{r+1}(\lambda) = \ldots = d_n(\lambda) = 0$$

whereas all the remaining polynomials of system (4) are different from zero.

The greatest common divisor $d_k(\lambda)$ of all minors of order $k$ of the λ-matrix $A(\lambda)$, $k = 1, 2, \ldots, n$, remains unchanged under elementary transformations of $A(\lambda)$.

This assertion is almost obvious when an elementary transformation of type (1) or (2) is performed in matrix $A(\lambda)$. For instance, if the $i$th row of the matrix is multiplied by a number $\alpha$ in the field $P$, $\alpha \neq 0$, then the $k$th-order minors through which the $i$th row passes will be multiplied by $\alpha$, whereas all the other $k$th-order minors will remain unchanged. But when seeking the greatest common divisor of several polynomials, any one of the polynomials can be multiplied with impunity by nonzero numbers from $P$.

Let us now consider elementary transformations of type (3) or (4). Let us, say, add to the $i$th row of $A(\lambda)$ the $j$th row, $j \neq i$, multiplied by the polynomial $\varphi(\lambda)$; denote the resulting matrix by $\bar{A}(\lambda)$ and denote by $\bar{d}_k(\lambda)$ the greatest common divisor of all its $k$th-order minors taken with leading coefficient 1. Let us see what happens to the $k$th-order minors of $A(\lambda)$ under this transformation.

It is clear that minors through which the $i$th row does not pass remain unchanged. Likewise, there is no change in those minors through which both the $i$th and $j$th rows pass, since a determinant is unaltered by adding a multiple of one row to another row. Finally, let us take any $k$th-order minor with the $i$th row passing through it, but not the $j$th row; denote it by $M$. The corresponding minor of the matrix $\bar{A}(\lambda)$ can evidently be represented by the sum of the minor $M$ and the minor $M'$, multiplied by $\varphi(\lambda)$, of the matrix $A(\lambda)$, which $M'$ is obtained from $M$ by replacing the elements of the $i$th row of $A(\lambda)$ by the corresponding elements of its $j$th row. Since both $M$ and $M'$ are divisible by $d_k(\lambda)$, it follows that $M + \varphi(\lambda) M'$ will also be divisible by $d_k(\lambda)$.

From the foregoing it follows that all the $k$th-order minors of matrix $\bar{A}(\lambda)$ are exactly divisible by $d_k(\lambda)$ and therefore $\bar{d}_k(\lambda)$ too is divisible by $d_k(\lambda)$. But since the elementary transformation at hand has an inverse of the same type, it follows that $d_k(\lambda)$ is likewise divisible by $\bar{d}_k(\lambda)$. But if one takes into account that the leading coefficients of both these polynomials are equal to unity, then $\bar{d}_k(\lambda) = d_k(\lambda)$, which completes the proof.

Thus, all λ-matrices equivalent to the matrix $A(\lambda)$ are associated with one and the same set of polynomials (4). Specifically, this refers to any one (if there are several) canonical matrix equivalent to $A(\lambda)$. Let (3) be such a matrix.

Let us compute the polynomial $d_k(\lambda)$, $k = 1, 2, \ldots, n$, using matrix (3). Clearly, the $k$th-order minor in the upper left corner of this matrix is equal to the product

$$e_1(\lambda) e_2(\lambda) \ldots e_k(\lambda) \qquad (5)$$

Furthermore, if we take, in matrix (3), the $k$th-order minor in the rows with indices $i_1, i_2, \ldots, i_k$, where $i_1 < i_2 < \ldots < i_k$, and in columns with the same indices, then this minor is equal to the product $e_{i_1}(\lambda) e_{i_2}(\lambda) \ldots e_{i_k}(\lambda)$, which is divisible by (5). Indeed, $1 \leq i_1$ and so $e_{i_1}(\lambda)$ is divisible by $e_1(\lambda)$, $2 \leq i_2$ and therefore $e_{i_2}(\lambda)$ is divisible by $e_2(\lambda)$, and so on. Finally, if in matrix (3) we take the $k$th-order minor through which the $i$th row of this matrix passes for at least one $i$ but does not pass its $i$th column, then this minor contains a zero row and is therefore equal to zero.

It follows from the foregoing that the product (5) will be the greatest common divisor of all $k$th-order minors of matrix (3) and, therefore, of the original matrix $A(\lambda)$,

$$d_k(\lambda) = e_1(\lambda) e_2(\lambda) \ldots e_k(\lambda), \quad k = 1, 2, \ldots, n \qquad (6)$$

It is now easy to show that the polynomials $e_k(\lambda)$, $k = 1, 2, \ldots, n$, are uniquely determined by the matrix $A(\lambda)$ itself. Let the rank of this matrix be $r$. Then, as we know, $d_r(\lambda) \neq 0$, but $d_{r+1}(\lambda) = 0$, and therefore, by (6), $e_{r+1}(\lambda) = 0$. Whence, because of the properties of a canonical matrix, it follows generally that if the rank $r$ of matrix $A(\lambda)$ is less than $n$, then

$$e_{r+1}(\lambda) = e_{r+2}(\lambda) = \ldots = e_n(\lambda) = 0$$

On the other hand, for $k \leq r$, it follows from (6), because $d_{k-1}(\lambda) \neq 0$, that

$$e_k(\lambda) = \frac{d_k(\lambda)}{d_{k-1}(\lambda)}$$

(for $k = 1$ the denominator is taken to be $d_0(\lambda) = 1$).

This completes the proof of the uniqueness of the canonical form of the λ-matrix. At the same time we have obtained a direct procedure for finding the polynomials $e_k(\lambda)$, which are called invariant factors of the matrix $A(\lambda)$.


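Example. Reduce the λ-matrix $A(\lambda)$ to canonical form. [The matrix and the chain of elementary transformations are illegible in the source.]

On the other hand, it might be possible to compute the invariant factors of the matrix $A(\lambda)$ directly. Namely, computing the greatest common divisor of the elements of this matrix, we obtain

$$d_1(\lambda) = e_1(\lambda) = \lambda$$

Now, computing the determinant of $A(\lambda)$ and noting that its leading coefficient is equal to 1, we obtain

$$d_2(\lambda) = \lambda^4 - 10\lambda^3 - 3\lambda^2$$

and so

$$e_2(\lambda) = \frac{d_2(\lambda)}{d_1(\lambda)} = \lambda^3 - 10\lambda^2 - 3\lambda$$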
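The direct procedure (greatest common divisors of minors, then quotients) can be sketched for a matrix of order 2. Everything below is an assumption made for illustration: the sample matrix is a hypothetical one chosen to be consistent with the values $d_1(\lambda) = \lambda$ and $d_2(\lambda) = \lambda^4 - 10\lambda^3 - 3\lambda^2$ quoted in the example, not a matrix recovered from the text, and the helper names are invented. Polynomials are stored as coefficient lists, lowest degree first, with exact rational arithmetic.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def sub(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, q):
    """Quotient and remainder of p divided by a nonzero polynomial q."""
    p, q = trim(p), trim(q)
    quo = [Fraction(0)] * max(len(p) - len(q) + 1, 0)
    while len(p) >= len(q):
        shift = len(p) - len(q)
        c = Fraction(p[-1]) / q[-1]
        quo[shift] = c
        p = sub(p, mul([0] * shift + [c], q))   # cancel the leading term
    return trim(quo), p

def monic_gcd(p, q):
    """Greatest common divisor with leading coefficient 1 (Euclidean algorithm)."""
    p, q = trim(p), trim(q)
    while q:
        p, q = q, divmod_poly(p, q)[1]
    return [Fraction(c) / p[-1] for c in p]

# Hypothetical sample matrix (an assumption, see the lead-in).
a11 = [0, -1, 0, 1]   # λ^3 - λ
a12 = [0, 0, 2]       # 2λ^2
a21 = [0, 5, 1]       # λ^2 + 5λ
a22 = [0, 3]          # 3λ

d1 = []
for entry in (a11, a12, a21, a22):
    d1 = monic_gcd(d1, entry)              # d_1: gcd of all first-order minors

det = sub(mul(a11, a22), mul(a12, a21))    # the only second-order minor
d2 = [Fraction(c) / det[-1] for c in det]  # d_2: determinant made monic

e1 = d1
e2 = divmod_poly(d2, d1)[0]                # e_2 = d_2 / d_1
print(e1, e2)
```

For matrices of higher order the same idea applies, but all minors of each order have to be enumerated, which this sketch does not attempt.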
60. Unimodular λ-Matrices. Relationship Between Similarity
of Numerical Matrices and the Equivalence
of Their Characteristic Matrices

From the results of the preceding section there follows a criterion of equivalence of λ-matrices, which may be stated in either of two almost identical formulations.

Two λ-matrices are equivalent if and only if they can be reduced to one and the same canonical form.

Two λ-matrices are equivalent if and only if they have the same invariant factors.

Let us derive another criterion of a different nature.

We know that the unit matrix $E$ is a canonical λ-matrix. We call the λ-matrix $U(\lambda)$ unimodular if it has the matrix $E$ for its canonical form; that is to say, if all its invariant factors are equal to unity.

The λ-matrix $U(\lambda)$ is unimodular if and only if its determinant is nonzero but does not depend on $\lambda$; that is, it is a nonzero number of the base field $P$.

Indeed, if $U(\lambda) \sim E$, then these two matrices are associated with one and the same polynomial $d_n(\lambda)$. However, $d_n(\lambda) = 1$ for the unit matrix. From this it follows that the determinant of the matrix $U(\lambda)$, which differs from $d_n(\lambda)$ only by a nonzero numerical factor, will be a nonzero number of the field $P$. Conversely, if the determinant of the matrix $U(\lambda)$ is different from zero and is not dependent on $\lambda$, then for this matrix the polynomial $d_n(\lambda)$ will be equal to unity and therefore, by (6) of Sec. 59, all invariant factors $e_i(\lambda)$ of $U(\lambda)$, $i = 1, 2, \ldots, n$, are equal to unity.

This implies that any nonsingular numerical matrix is a unimodular λ-matrix. However, a unimodular λ-matrix can be very complicated. Thus, the λ-matrix

$$\begin{pmatrix} \lambda & \lambda^3 + 5 \\ \lambda^2 - \lambda - 4 & \lambda^4 - \lambda^3 - 4\lambda^2 + 5\lambda - 5 \end{pmatrix}$$

is unimodular, since its determinant is equal to 20; that is to say, it is different from zero and is not dependent on $\lambda$.
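The determinant can be checked by multiplying out the polynomials. The sketch below assumes the entries $\lambda$, $\lambda^3 + 5$, $\lambda^2 - \lambda - 4$ and $\lambda^4 - \lambda^3 - 4\lambda^2 + 5\lambda - 5$; the constant term $-5$ of the last entry is an assumption, chosen because it is the value that makes the determinant equal to the stated 20. The helper names `pmul` and `psub` are hypothetical, and coefficient lists run from the constant term upward.

```python
def pmul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def psub(p, q):
    """Difference p - q, padded to a common length."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

u11 = [0, 1]                  # λ
u12 = [5, 0, 0, 1]            # λ^3 + 5
u21 = [-4, -1, 1]             # λ^2 - λ - 4
u22 = [-5, 5, -4, -1, 1]      # λ^4 - λ^3 - 4λ^2 + 5λ - 5 (constant term assumed)

det = psub(pmul(u11, u22), pmul(u12, u21))
print(det)   # only the constant coefficient is nonzero
```

All coefficients of positive powers of $\lambda$ cancel, so the determinant is a nonzero constant, as the criterion for unimodularity requires.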

From the theorem proved above it follows that a product of unimodular λ-matrices is unimodular: it suffices to recall that in matrix multiplication the determinants are multiplied together.

The λ-matrix $U(\lambda)$ is unimodular if and only if there is an inverse matrix which is also a λ-matrix.

Indeed, if we have a nonsingular λ-matrix, then in seeking the inverse matrix in ordinary fashion we will have to divide the cofactors of the elements of the given matrix by the determinant of the matrix, i.e., by some polynomial in $\lambda$. Therefore, in the general case, the elements of the inverse matrix will be rational fractions in $\lambda$ and not polynomials in $\lambda$; that is, this matrix is not a λ-matrix. But if a unimodular matrix is given, then we will have to divide the cofactors only by a nonzero number from the field $P$, i.e., the elements of the inverse matrix will be polynomials in $\lambda$ and therefore the inverse matrix will itself be a λ-matrix. Conversely, if the λ-matrix $U(\lambda)$ has an inverse λ-matrix $U^{-1}(\lambda)$, then the determinants of both matrices are polynomials in $\lambda$, their product is equal to 1, and therefore both determinants must be zero-degree polynomials.

There follows from this last remark a supplement to the theorem just proved: A λ-matrix inverse to a unimodular λ-matrix is unimodular.

The concept of a unimodular matrix is used in the statement of the following new equivalence criterion of λ-matrices: Two λ-matrices $A(\lambda)$ and $B(\lambda)$ of order $n$ are equivalent if and only if there exist unimodular λ-matrices $U(\lambda)$ and $V(\lambda)$ of the same order $n$ such that

$$B(\lambda) = U(\lambda) A(\lambda) V(\lambda) \qquad (1)$$

First, we introduce the following concept used in the proof of this criterion. We use the term elementary matrix to denote a numerical (and, hence, λ-) matrix

$$\begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \alpha & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \qquad (2)$$

that differs from the unit matrix in only one way: there is an arbitrary nonzero number $\alpha$ from the field $P$ in some $i$th position of the principal diagonal, $1 \leq i \leq n$. On the other hand, we will use the term elementary matrix for the λ-matrix

$$\begin{pmatrix} 1 & & & & \\ & \ddots & & \varphi(\lambda) & \\ & & \ddots & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \qquad (3)$$

which differs from the unit matrix in only one way: an arbitrary polynomial $\varphi(\lambda)$ from the ring $P[\lambda]$ occupies the position at the intersection of the $i$th row and the $j$th column, $1 \leq i \leq n$, $1 \leq j \leq n$, $i \neq j$.

Every elementary matrix is unimodular. This is quite obvious since the determinant of matrix (2) is equal to $\alpha$, but, by hypothesis, $\alpha \neq 0$; however, the determinant of the matrix (3) is equal to 1.

Performance of any elementary transformation in the λ-matrix $A(\lambda)$ is equivalent to multiplying this matrix on the left or on the right by some elementary matrix.

It will be easy for the reader to verify the truth of the following four assertions: (1) multiplication of the matrix $A(\lambda)$ on the left by the matrix (2) is equivalent to multiplication of the $i$th row of $A(\lambda)$ by the scalar $\alpha$; (2) multiplication of $A(\lambda)$ on the right by matrix (2) is equivalent to multiplication of the $i$th column of the matrix $A(\lambda)$ by the scalar $\alpha$; (3) multiplication of matrix $A(\lambda)$ on the left by matrix (3) is equivalent to adding to the $i$th row of $A(\lambda)$ its $j$th row multiplied by $\varphi(\lambda)$; (4) multiplication of the matrix $A(\lambda)$ on the right by matrix (3) is equivalent to adding to the $j$th column of $A(\lambda)$ its $i$th column multiplied by $\varphi(\lambda)$.
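Assertions (3) and (4) can be checked on a small numerical case. The sketch below is an illustration under simplifying assumptions: the matrix is numerical and the polynomial $\varphi(\lambda)$ is taken to be the constant 10, so that ordinary integer arithmetic suffices. The function names are hypothetical, and the code uses 0-based indices where the text uses 1-based ones.

```python
def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def elementary(n, i, j, phi):
    """Unit matrix of order n with phi placed in row i, column j (i != j)."""
    e = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    e[i][j] = phi
    return e

a = [[1, 2], [3, 4]]
t = elementary(2, 0, 1, 10)   # phi = 10 in row 1, column 2 of the text's numbering

left = matmul(t, a)    # first row becomes (row 1) + 10 * (row 2)
right = matmul(a, t)   # second column becomes (column 2) + 10 * (column 1)
print(left, right)
```

Note the asymmetry: multiplication on the left changes the $i$th row, whereas multiplication on the right by the same matrix changes the $j$th column.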

Let us now take up the proof of our criterion of the equivalence of λ-matrices. If $A(\lambda) \sim B(\lambda)$, then we can proceed from $A(\lambda)$ to $B(\lambda)$ by means of a finite number of elementary transformations. Replacing each of those transformations by multiplication on the left or on the right by an elementary matrix, we arrive at the equation

$$B(\lambda) = U_k(\lambda) \ldots U_1(\lambda) A(\lambda) V_1(\lambda) \ldots V_l(\lambda) \qquad (4)$$

where all the matrices $U_1(\lambda), \ldots, U_k(\lambda), V_1(\lambda), \ldots, V_l(\lambda)$ are elementary and, hence, unimodular. Hence, the matrices

$$U(\lambda) = U_k(\lambda) \ldots U_1(\lambda), \quad V(\lambda) = V_1(\lambda) \ldots V_l(\lambda) \qquad (5)$$

which are products of unimodular matrices will also be unimodular, and equation (4) will be rewritten as (1). Notice that if, say, $k = 0$, i.e., elementary transformations are performed on columns only, then we simply put $U(\lambda) = E$.

This portion of the proof already allows us to make the following assertion.

A λ-matrix is unimodular if and only if it is representable as a product of elementary matrices.

True enough, for we have already taken advantage of the fact that a product of elementary matrices is unimodular. Conversely, if we have an arbitrary unimodular matrix $W(\lambda)$, then it is equivalent to the unit matrix $E$. Applying the foregoing proof to matrices $E$ and $W(\lambda)$ instead of $A(\lambda)$ and $B(\lambda)$, we get from (4) the equation

$$W(\lambda) = U_k(\lambda) \ldots U_1(\lambda) V_1(\lambda) \ldots V_l(\lambda)$$

which is to say that the matrix $W(\lambda)$ is represented as a product of elementary matrices.

It is now easy to prove the converse assertion of our criterion.
Suppose that for the matrices $A(\lambda)$ and $B(\lambda)$ there are unimodular
matrices $U(\lambda)$ and $V(\lambda)$ such that (1) holds. From what has been
proved, the matrices $U(\lambda)$ and $V(\lambda)$ may be represented as products
of elementary matrices; let these be the representations (5). Then (1)
can be rewritten as (4) and, substituting the corresponding elementary
transformation for each multiplication by an elementary matrix,
we finally obtain $A(\lambda) \sim B(\lambda)$.

Matrix polynomials. We can take an entirely different view of the
$\lambda$-matrix concept and use the term matrix $\lambda$-polynomial of order $n$ over
the field $P$ for a polynomial in $\lambda$ whose coefficients are square matrices
of the same order $n$ with elements from the field $P$. Its general form is

$$A_0\lambda^k + A_1\lambda^{k-1} + \dots + A_{k-1}\lambda + A_k \tag{6}$$

Regarding (in accordance with Sec. 15) the multiplication of
matrix $A_i$ by $\lambda^{k-i}$, $i = 0, 1, \dots, k$, as the multiplication by $\lambda^{k-i}$
of all elements of the matrix $A_i$, and then performing matrix addition
in accord with that same Sec. 15, we find that any matrix
$\lambda$-polynomial of order $n$ may be written as a $\lambda$-matrix of order $n$. Thus,

u;)- c 1) - c J) >-k: :) 

Ak'^ + X -3X= 4- 2X -- I 


-X^ 


X" k‘ - 2k 




Conversely, any $\lambda$-matrix of order $n$ may be written in the form of
a matrix $\lambda$-polynomial of order $n$. Thus,


X^-h2X 

The correspondence between $\lambda$-matrices and matrix $\lambda$-polynomials
is one-to-one and isomorphic in the meaning of Sec. 46. Indeed, the
equality of $\lambda$-polynomials of the form (6) as matrices is equivalent
to the equality of matrix coefficients of identical powers of $\lambda$, and the
multiplication of a matrix by $\lambda$ is equivalent to its multiplication
by a scalar matrix with $\lambda$ on the principal diagonal.

Suppose we have a $\lambda$-matrix $A(\lambda)$ and

$$A(\lambda) = A_0\lambda^k + A_1\lambda^{k-1} + \dots + A_{k-1}\lambda + A_k$$

where the matrix $A_0$ is not a zero matrix. We call the number $k$
the degree of the $\lambda$-matrix $A(\lambda)$; clearly, this is the highest degree
(in $\lambda$) of the elements of the matrix $A(\lambda)$.

The view taken of $\lambda$-matrices as matrix polynomials permits
developing for $\lambda$-matrices a theory of divisibility similar to the
theory of divisibility for numerical polynomials, made more
complicated, true, by the noncommutativity of matrix multiplication
and the presence of divisors of zero. We restrict ourselves to the sole
problem of the division algorithm (with remainder).

Given, over the field $P$, the $n$th-order $\lambda$-matrices

$$A(\lambda) = A_0\lambda^k + A_1\lambda^{k-1} + \dots + A_{k-1}\lambda + A_k$$

$$B(\lambda) = B_0\lambda^l + B_1\lambda^{l-1} + \dots + B_{l-1}\lambda + B_l$$

Assume that the matrix $B_0$ is nonsingular, i.e., there exists a matrix
$B_0^{-1}$. Then, over the field $P$ it is possible to find $\lambda$-matrices $Q_1(\lambda)$ and
$R_1(\lambda)$ of the same order $n$ such that

$$A(\lambda) = B(\lambda)\,Q_1(\lambda) + R_1(\lambda) \tag{7}$$

The degree of $R_1(\lambda)$ is less than the degree of $B(\lambda)$ or $R_1(\lambda) = 0$.
On the other hand, there are, over $P$, $\lambda$-matrices $Q_2(\lambda)$ and $R_2(\lambda)$
of order $n$ such that

$$A(\lambda) = Q_2(\lambda)\,B(\lambda) + R_2(\lambda) \tag{8}$$

The degree of $R_2(\lambda)$ is less than the degree of $B(\lambda)$ or $R_2(\lambda) = 0$. The
matrices $Q_1(\lambda)$ and $R_1(\lambda)$ and also $Q_2(\lambda)$ and $R_2(\lambda)$ which satisfy
these conditions are uniquely determined.

The proof of this theorem follows the same lines as that of the
corresponding theorem for numerical polynomials (see Sec. 20). For
instance, let condition (7) be satisfied also by the matrices $\bar{Q}_1(\lambda)$
and $\bar{R}_1(\lambda)$, where the degree of $\bar{R}_1(\lambda)$ is less than the degree of $B(\lambda)$.
Then

$$B(\lambda)\,[Q_1(\lambda) - \bar{Q}_1(\lambda)] = \bar{R}_1(\lambda) - R_1(\lambda)$$

The degree of the right side is less than $l$, but the degree of the left
side (if the square bracket is nonzero) is greater than or equal to $l$,
since the matrix $B_0$ is nonsingular. Whence follows the uniqueness
of the matrices $Q_1(\lambda)$ and $R_1(\lambda)$.

To prove the existence of such matrices, notice that for $k \geq l$
the degree of the difference

$$A(\lambda) - B(\lambda)\,B_0^{-1}A_0\lambda^{k-l}$$

will be strictly less than $k$; therefore $B_0^{-1}A_0\lambda^{k-l}$ will be the
highest-degree term of the matrix $\lambda$-polynomial $Q_1(\lambda)$. The continuation
is the same as in Sec. 20. On the other hand, the degree of the
difference

$$A(\lambda) - A_0B_0^{-1}\lambda^{k-l}\,B(\lambda)$$

is also strictly less than $k$, that is, $A_0B_0^{-1}\lambda^{k-l}$ will be the
highest-degree term of the matrix $\lambda$-polynomial $Q_2(\lambda)$. We see that the
$\lambda$-matrices $Q_1(\lambda)$ and $Q_2(\lambda)$ [and also $R_1(\lambda)$ and $R_2(\lambda)$] which satisfy
the conditions of the theorem will indeed be distinct in the general
case.

Fundamental theorem on the similarity of matrices. Earlier
we mentioned the fact that as yet we have no way of deciding whether
two numerical matrices $A$ and $B$ (that is, matrices with elements
in the base field $P$) are similar or not. On the other hand, their
characteristic matrices $A - \lambda E$ and $B - \lambda E$ are $\lambda$-matrices and the
question of the equivalence of these matrices is something that can
be resolved effectively. It is therefore clear why the following
theorem is of such great importance.

The matrices $A$ and $B$ with elements in the field $P$ are similar if
and only if their characteristic matrices $A - \lambda E$ and $B - \lambda E$ are
equivalent.

Indeed, let the matrices $A$ and $B$ be similar, i.e., there is, over
the field $P$, a nonsingular matrix $C$ such that

$$B = C^{-1}AC$$

Then

$$C^{-1}(A - \lambda E)C = C^{-1}AC - \lambda\,(C^{-1}EC) = B - \lambda E$$

The nonsingular numerical matrices $C^{-1}$ and $C$ are, however,
unimodular $\lambda$-matrices. We see that the matrix $B - \lambda E$ is obtained by
multiplying the matrix $A - \lambda E$ on the left and on the right by
unimodular matrices, that is, $A - \lambda E \sim B - \lambda E$.

Proof of the converse assertion is more complicated. Let

$$A - \lambda E \sim B - \lambda E$$

Then there exist unimodular matrices $U(\lambda)$ and $V(\lambda)$ such that

$$U(\lambda)(A - \lambda E)V(\lambda) = B - \lambda E \tag{9}$$

Taking into account that unimodular matrices have inverse matrices
which are $\lambda$-matrices, we derive from (9) the following equalities
which will be used in the sequel:

$$\left.\begin{aligned}
U(\lambda)(A - \lambda E) &= (B - \lambda E)\,V^{-1}(\lambda) \\
(A - \lambda E)\,V(\lambda) &= U^{-1}(\lambda)(B - \lambda E)
\end{aligned}\right\} \tag{10}$$

Since the $\lambda$-matrix $B - \lambda E$ has degree 1 in $\lambda$, the nonsingular
matrix $-E$ serving as the leading coefficient of the corresponding
matrix polynomial, it follows that we can apply the division
algorithm to the matrices $U(\lambda)$ and $B - \lambda E$: there are matrices $Q_1(\lambda)$
and $R_1$ (the latter, if nonzero, must have degree 0 in $\lambda$, i.e., it is
independent of $\lambda$) such that

$$U(\lambda) = (B - \lambda E)\,Q_1(\lambda) + R_1 \tag{11}$$

Similarly,

$$V(\lambda) = Q_2(\lambda)(B - \lambda E) + R_2 \tag{12}$$

Using (11) and (12), we get, from (9),

$$\begin{aligned}
R_1(A - \lambda E)R_2 = {} & (B - \lambda E) - U(\lambda)(A - \lambda E)\,Q_2(\lambda)(B - \lambda E) \\
& - (B - \lambda E)\,Q_1(\lambda)(A - \lambda E)\,V(\lambda) \\
& + (B - \lambda E)\,Q_1(\lambda)(A - \lambda E)\,Q_2(\lambda)(B - \lambda E)
\end{aligned}$$

or, by (10),

$$\begin{aligned}
R_1(A - \lambda E)R_2 = {} & (B - \lambda E) - (B - \lambda E)\,V^{-1}(\lambda)\,Q_2(\lambda)(B - \lambda E) \\
& - (B - \lambda E)\,Q_1(\lambda)\,U^{-1}(\lambda)(B - \lambda E) \\
& + (B - \lambda E)\,Q_1(\lambda)(A - \lambda E)\,Q_2(\lambda)(B - \lambda E) \\
= {} & (B - \lambda E)\,\{E - [V^{-1}(\lambda)\,Q_2(\lambda) + Q_1(\lambda)\,U^{-1}(\lambda) \\
& \qquad - Q_1(\lambda)(A - \lambda E)\,Q_2(\lambda)]\,(B - \lambda E)\}
\end{aligned}$$

The square bracket on the right is actually zero, for otherwise,
being a $\lambda$-matrix [since both $U^{-1}(\lambda)$ and $V^{-1}(\lambda)$ are $\lambda$-matrices],
it would at least be of degree 0, but then the degree of the curly
brackets would not be less than 1 and, hence, the degree of the entire
right member would not be less than 2. But this is impossible since
on the left-hand side we have a $\lambda$-matrix whose degree does not exceed 1.

Thus,

$$R_1(A - \lambda E)R_2 = B - \lambda E$$

whence, equating the matrix coefficients of identical powers of
$\lambda$, we get

$$R_1AR_2 = B \tag{13}$$

$$R_1R_2 = E \tag{14}$$

Equation (14) shows that the numerical matrix $R_2$ is not only
nonzero but is even nonsingular, and

$$R_2^{-1} = R_1$$

But then equation (13) takes the form

$$R_2^{-1}AR_2 = B$$

which proves the similarity of the matrices $A$ and $B$.

We have at the same time learned to find the nonsingular matrix
$R_2$ which transforms matrix $A$ into matrix $B$. Namely, if the
matrices $A - \lambda E$ and $B - \lambda E$ are equivalent, then a finite number
of elementary transformations carries the first into the second. Take
those transformations which refer to columns; denote by $V(\lambda)$ the
product of the corresponding elementary matrices taken in the
same order. Then divide $V(\lambda)$ by $B - \lambda E$ and perform the division
so that the quotient is on the left of the divisor [see (8)]. The
remainder of this division will be just the matrix $R_2$.

Actually, this division need not be performed; one can take
advantage of the following lemma, which will also be of use in Sec. 62.

Lemma. Let

$$V(\lambda) = V_0\lambda^s + V_1\lambda^{s-1} + \dots + V_{s-1}\lambda + V_s, \qquad V_0 \neq 0 \tag{15}$$

If

$$V(\lambda) = (\lambda E - B)\,Q_1(\lambda) + R_1, \qquad V(\lambda) = Q_2(\lambda)(\lambda E - B) + R_2 \tag{16}$$

then

$$\begin{aligned}
R_1 &= B^sV_0 + B^{s-1}V_1 + \dots + BV_{s-1} + V_s \\
R_2 &= V_0B^s + V_1B^{s-1} + \dots + V_{s-1}B + V_s
\end{aligned} \tag{17}$$

It suffices to prove the first of these two assertions, because the
second is proved similarly. The proof consists in direct verification
of the validity of (16) if the polynomial $V(\lambda)$ is replaced by its
notation (15), if (17) is substituted for $R_1$, and if in place of $Q_1(\lambda)$ we
take the polynomial

$$\begin{aligned}
Q_1(\lambda) = {} & V_0\lambda^{s-1} + (BV_0 + V_1)\lambda^{s-2} + (B^2V_0 + BV_1 + V_2)\lambda^{s-3} \\
& + \dots + (B^{s-1}V_0 + B^{s-2}V_1 + \dots + V_{s-1})
\end{aligned}$$

This verification is left to the reader.
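The verification asked of the reader can also be checked numerically. The following is a minimal sketch, not part of the original text: the matrices `B`, `V0`, `V1`, `V2` are hypothetical sample data, and the helper names (`divide_by_linear`, `closed_form_remainder`, `recombine`) are introduced here only for illustration. It divides a matrix polynomial $V(\lambda)$ of degree $s \geq 1$ by $\lambda E - B$ on either side using the Horner-type recurrence implicit in the formula for $Q_1(\lambda)$, and compares the remainders with formula (17).

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_neg(X):
    return [[-a for a in row] for row in X]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def divide_by_linear(V, B, side):
    """Divide V(lam) = V[0]*lam^s + ... + V[s] by lam*E - B.

    side == "left":  V = (lam*E - B) * Q + R   (divisor on the left)
    side == "right": V = Q * (lam*E - B) + R   (divisor on the right)
    Returns (Q, R); Q is a list of coefficient matrices, highest power first.
    """
    acc, quotient = V[0], []
    for coeff in V[1:]:
        quotient.append(acc)
        step = mat_mul(B, acc) if side == "left" else mat_mul(acc, B)
        acc = mat_add(step, coeff)
    return quotient, acc

def closed_form_remainder(V, B, side):
    """Formula (17): sum of B^(s-i) V_i (left) or V_i B^(s-i) (right)."""
    n, s = len(B), len(V) - 1
    total = [[Fraction(0)] * n for _ in range(n)]
    for i, coeff in enumerate(V):
        term = coeff
        for _ in range(s - i):
            term = mat_mul(B, term) if side == "left" else mat_mul(term, B)
        total = mat_add(total, term)
    return total

def poly_mul(P, Q):
    """Product of matrix polynomials (coefficient lists, highest power first)."""
    n = len(P[0])
    out = [[[Fraction(0)] * n for _ in range(n)]
           for _ in range(len(P) + len(Q) - 1)]
    for i, p in enumerate(P):
        for j, q in enumerate(Q):
            out[i + j] = mat_add(out[i + j], mat_mul(p, q))
    return out

def recombine(Q, R, B, side):
    """Multiply the divisor back by Q and add R; should reproduce V (s >= 1)."""
    divisor = [identity(len(B)), mat_neg(B)]        # lam*E - B
    prod = poly_mul(divisor, Q) if side == "left" else poly_mul(Q, divisor)
    prod[-1] = mat_add(prod[-1], R)
    return prod

# Hypothetical sample data (degree s = 2, order n = 2).
B = [[1, 2], [0, 3]]
V = [[[1, 0], [1, 1]], [[0, 1], [2, 0]], [[3, 1], [0, 1]]]

Q1, R1 = divide_by_linear(V, B, "left")
Q2, R2 = divide_by_linear(V, B, "right")
print(R1, R2)
```

In this sample the two remainders differ, which illustrates why the side on which the divisor stands has to be specified.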

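Example. Given the matrices

$$A = \begin{pmatrix} -2 & 1 \\ 0 & 3 \end{pmatrix}, \qquad
B = \begin{pmatrix} -10 & -4 \\ 26 & 11 \end{pmatrix}$$

Their characteristic matrices are equivalent since they can be reduced to one
and the same canonical form

$$\begin{pmatrix} 1 & 0 \\ 0 & \lambda^2 - \lambda - 6 \end{pmatrix}$$

The matrices $A$ and $B$ are thus similar.

To find the matrix $R_2$ that transforms $A$ into $B$, let us find some chain
of elementary transformations that carries $A - \lambda E$ into $B - \lambda E$. Thus,

$$\begin{aligned}
A - \lambda E = \begin{pmatrix} -2-\lambda & 1 \\ 0 & 3-\lambda \end{pmatrix}
&\sim \begin{pmatrix} -2-\lambda & 1 \\ -16-8\lambda & 11-\lambda \end{pmatrix}
\sim \begin{pmatrix} 8+4\lambda & -4 \\ -16-8\lambda & 11-\lambda \end{pmatrix} \\
&\sim \begin{pmatrix} 40+4\lambda & -4 \\ -104 & 11-\lambda \end{pmatrix}
\sim \begin{pmatrix} -10-\lambda & -4 \\ 26 & 11-\lambda \end{pmatrix} = B - \lambda E
\end{aligned}$$

The last two transformations refer to columns: to the first column we add the
second multiplied by $-8$ and then we multiply the first column by $-\frac{1}{4}$. The
product of the corresponding elementary matrices will be

$$\begin{pmatrix} 1 & 0 \\ -8 & 1 \end{pmatrix}
\begin{pmatrix} -\frac{1}{4} & 0 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} -\frac{1}{4} & 0 \\ 2 & 1 \end{pmatrix}$$

This matrix does not depend on $\lambda$ and therefore it is the sought-for matrix $R_2$.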
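The result of the example can be checked by direct computation. The sketch below is an illustration rather than part of the original text; it builds $R_2$ as the product of the two column operations named above (written as elementary matrices acting on the right) and confirms with exact rational arithmetic that $R_2^{-1}AR_2 = B$. The helper `mat_mul` and the variable names are choices made for this sketch.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[-2, 1], [0, 3]]
B = [[-10, -4], [26, 11]]

# Elementary matrices of the two column operations (they act on the right).
add_col = [[1, 0], [-8, 1]]                  # column 1 += (-8) * column 2
scale_col = [[Fraction(-1, 4), 0], [0, 1]]   # column 1 *= -1/4

R2 = mat_mul(add_col, scale_col)             # product taken in the same order

# Inverse of a 2x2 matrix via the adjoint formula.
det = R2[0][0] * R2[1][1] - R2[0][1] * R2[1][0]
R2_inv = [[R2[1][1] / det, -R2[0][1] / det],
          [-R2[1][0] / det, R2[0][0] / det]]

transformed = mat_mul(mat_mul(R2_inv, A), R2)
print(transformed == B)   # True
```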

Of course, the matrix that transforms $A$ into $B$ is by no means determined
uniquely. For example, the matrix

(U) 


will also be of that kind.

61. Jordan Normal Form


We will now consider $n$th-order square matrices with elements
in the field $P$. We will isolate a special type called Jordan matrices,
and it will be shown that these matrices serve as a normal form for
a very broad class of matrices. Namely, matrices all the characteristic
roots of which lie in the base field $P$ (and only such matrices) are similar
to certain Jordan matrices; we say that they can be reduced to a Jordan
normal form. It will then follow, if for the field $P$ we take the field
of complex numbers, that any matrix with complex elements can be
reduced to a Jordan normal form in the field of complex numbers.

We will need some definitions. A $k$th-order Jordan submatrix
referring to the number $\lambda_0$ is a matrix of order $k$, $1 \leq k \leq n$, of the
form

$$\begin{pmatrix}
\lambda_0 & 1 & & & 0 \\
 & \lambda_0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & \lambda_0 & 1 \\
0 & & & & \lambda_0
\end{pmatrix} \tag{1}$$

In other words, one and the same number $\lambda_0$ from the field $P$ occupies
the principal diagonal, with unity along the diagonal immediately
above and zero elsewhere. Thus,

$$(\lambda_0), \qquad
\begin{pmatrix} \lambda_0 & 1 \\ 0 & \lambda_0 \end{pmatrix}, \qquad
\begin{pmatrix} \lambda_0 & 1 & 0 \\ 0 & \lambda_0 & 1 \\ 0 & 0 & \lambda_0 \end{pmatrix}$$

are, respectively, Jordan submatrices of first, second and third order.
A Jordan matrix of order $n$ is a matrix of order $n$ having the form

$$J = \begin{pmatrix}
J_1 & & & 0 \\
 & J_2 & & \\
 & & \ddots & \\
0 & & & J_s
\end{pmatrix} \tag{2}$$

The elements along the principal diagonal are Jordan submatrices
$J_1, J_2, \dots, J_s$ of certain orders, not necessarily distinct, referring
to certain numbers (not necessarily distinct either) lying in the
field $P$. All other positions have zeros. Here, $s \geq 1$, that is to say,
one Jordan submatrix of order $n$ belongs to Jordan matrices of this
order, and, naturally, $s \leq n$.

It may be noted (though this will not be used in what follows)
that the structure of the Jordan matrix can be described without
resorting to the concept of the Jordan submatrix. It is obvious,
namely, that the matrix $J$ is a Jordan matrix if and only if it has
the form

$$J = \begin{pmatrix}
\lambda_1 & \varepsilon_1 & & & 0 \\
 & \lambda_2 & \varepsilon_2 & & \\
 & & \ddots & \ddots & \\
 & & & \lambda_{n-1} & \varepsilon_{n-1} \\
0 & & & & \lambda_n
\end{pmatrix}$$

where $\lambda_i$, $i = 1, 2, \dots, n$, are arbitrary numbers in $P$ and every
$\varepsilon_j$, $j = 1, 2, \dots, n-1$, is equal to unity or zero; note that if
$\varepsilon_j = 1$, then $\lambda_j = \lambda_{j+1}$.

Diagonal matrices are a special case of Jordan matrices. These
are Jordan matrices whose Jordan submatrices are of order 1.

Our immediate aim is to find the canonical form of the characteristic
matrix $J - \lambda E$ of an arbitrary Jordan matrix $J$ of order $n$.
We will first find the canonical form of the characteristic matrix

$$\begin{pmatrix}
\lambda_0 - \lambda & 1 & & & 0 \\
 & \lambda_0 - \lambda & 1 & & \\
 & & \ddots & \ddots & \\
 & & & \lambda_0 - \lambda & 1 \\
0 & & & & \lambda_0 - \lambda
\end{pmatrix} \tag{3}$$

of a single Jordan submatrix (1) of order $k$. Computing the determinant
of this matrix and recalling that the leading coefficient of the
polynomial $d_k(\lambda)$ must be equal to 1, we find that

$$d_k(\lambda) = (\lambda - \lambda_0)^k$$

On the other hand, among the $(k-1)$th-order minors of the matrix
(3) there is a minor equal to unity; this is the minor obtained by
deleting the first column and the last row of the matrix. Therefore

$$d_{k-1}(\lambda) = 1$$

From this it follows that the following $k$th-order $\lambda$-matrix

$$\begin{pmatrix}
1 & & & 0 \\
 & \ddots & & \\
 & & 1 & \\
0 & & & (\lambda - \lambda_0)^k
\end{pmatrix} \tag{4}$$

is the canonical form of the matrix (3).
We now prove the following lemma.


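If the polynomials $\varphi_1(\lambda), \varphi_2(\lambda), \dots, \varphi_t(\lambda)$ from the ring $P[\lambda]$
are pairwise relatively prime, the following equivalence holds true:

$$\begin{pmatrix}
\varphi_1(\lambda) & & & 0 \\
 & \varphi_2(\lambda) & & \\
 & & \ddots & \\
0 & & & \varphi_t(\lambda)
\end{pmatrix} \sim
\begin{pmatrix}
1 & & & 0 \\
 & \ddots & & \\
 & & 1 & \\
0 & & & \prod_{i=1}^{t} \varphi_i(\lambda)
\end{pmatrix}$$

It is evidently sufficient to consider the case of $t = 2$. Since the
polynomials $\varphi_1(\lambda)$ and $\varphi_2(\lambda)$ are relatively prime, there are
polynomials $u_1(\lambda)$ and $u_2(\lambda)$ in the ring $P[\lambda]$ such that

$$\varphi_1(\lambda)\,u_1(\lambda) + \varphi_2(\lambda)\,u_2(\lambda) = 1$$

Therefore

$$\begin{aligned}
\begin{pmatrix} \varphi_1(\lambda) & 0 \\ 0 & \varphi_2(\lambda) \end{pmatrix}
&\sim \begin{pmatrix} \varphi_1(\lambda) & \varphi_1(\lambda)\,u_1(\lambda) \\ 0 & \varphi_2(\lambda) \end{pmatrix}
\sim \begin{pmatrix} \varphi_1(\lambda) & \varphi_1(\lambda)\,u_1(\lambda) + \varphi_2(\lambda)\,u_2(\lambda) \\ 0 & \varphi_2(\lambda) \end{pmatrix} \\
&= \begin{pmatrix} \varphi_1(\lambda) & 1 \\ 0 & \varphi_2(\lambda) \end{pmatrix}
\sim \begin{pmatrix} 1 & \varphi_1(\lambda) \\ \varphi_2(\lambda) & 0 \end{pmatrix}
\sim \begin{pmatrix} 1 & \varphi_1(\lambda) \\ 0 & -\varphi_1(\lambda)\,\varphi_2(\lambda) \end{pmatrix} \\
&\sim \begin{pmatrix} 1 & 0 \\ 0 & -\varphi_1(\lambda)\,\varphi_2(\lambda) \end{pmatrix}
\sim \begin{pmatrix} 1 & 0 \\ 0 & \varphi_1(\lambda)\,\varphi_2(\lambda) \end{pmatrix}
\end{aligned}$$

which is what we set out to prove.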
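As an illustration of the lemma, here is a worked instance with polynomials chosen for this note (they do not come from the text): $\varphi_1(\lambda) = \lambda - 1$, $\varphi_2(\lambda) = \lambda - 2$, for which one may take $u_1(\lambda) = 1$, $u_2(\lambda) = -1$, since $(\lambda - 1) - (\lambda - 2) = 1$. Each step is a single elementary transformation of the kind used in the proof.

```latex
\begin{aligned}
\begin{pmatrix} \lambda-1 & 0 \\ 0 & \lambda-2 \end{pmatrix}
&\sim \begin{pmatrix} \lambda-1 & \lambda-1 \\ 0 & \lambda-2 \end{pmatrix}
  && \text{(column 2} \mathrel{+}= u_1 \cdot \text{column 1)} \\
&\sim \begin{pmatrix} \lambda-1 & 1 \\ 0 & \lambda-2 \end{pmatrix}
  && \text{(row 1} \mathrel{+}= u_2 \cdot \text{row 2)} \\
&\sim \begin{pmatrix} 1 & \lambda-1 \\ \lambda-2 & 0 \end{pmatrix}
  && \text{(interchange of columns)} \\
&\sim \begin{pmatrix} 1 & \lambda-1 \\ 0 & -(\lambda-1)(\lambda-2) \end{pmatrix}
  && \text{(row 2} \mathrel{-}= (\lambda-2) \cdot \text{row 1)} \\
&\sim \begin{pmatrix} 1 & 0 \\ 0 & -(\lambda-1)(\lambda-2) \end{pmatrix}
  && \text{(column 2} \mathrel{-}= (\lambda-1) \cdot \text{column 1)} \\
&\sim \begin{pmatrix} 1 & 0 \\ 0 & (\lambda-1)(\lambda-2) \end{pmatrix}
  && \text{(row 2 multiplied by } -1\text{)}
\end{aligned}
```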

Let us now consider the characteristic matrix

$$J - \lambda E = \begin{pmatrix}
J_1 - \lambda E_1 & & & 0 \\
 & J_2 - \lambda E_2 & & \\
 & & \ddots & \\
0 & & & J_s - \lambda E_s
\end{pmatrix} \tag{5}$$

of the Jordan matrix $J$ of type (2); here, $E_i$, $i = 1, 2, \dots, s$,
is a unit matrix of the same order as the submatrix $J_i$. Let the
Jordan submatrices of the matrix $J$ refer to the following distinct
numbers: $\lambda_1, \lambda_2, \dots, \lambda_t$, where $t \leq s$. Furthermore, let there refer
to the number $\lambda_i$, $i = 1, 2, \dots, t$, $q_i$ Jordan submatrices, $q_i \geq 1$,
and let the orders of the submatrices (arranged in nonincreasing
order) be

$$k_{i1} \geq k_{i2} \geq \dots \geq k_{iq_i} \tag{6}$$

Let it be noted (though we will not make use of this fact) that

$$\sum_{i=1}^{t} q_i = s, \qquad \sum_{i=1}^{t} \sum_{j=1}^{q_i} k_{ij} = n$$

Applying elementary transformations to the rows and columns
of matrix (5) which pass through the submatrix $J_i - \lambda E_i$ of this
matrix, we will quite obviously not involve the other diagonal
submatrices, whence it follows that it is possible, in matrix (5),
to replace by means of elementary transformations every submatrix
$J_i - \lambda E_i$, $i = 1, 2, \dots, s$, by a corresponding submatrix of the
type (4). In other words, the matrix $J - \lambda E$ is equivalent to a diagonal
matrix, the diagonal elements of which consist (aside from a certain
number of units) of the following polynomials which correspond to all
Jordan submatrices of the matrix $J$:

$$\begin{array}{llll}
(\lambda - \lambda_1)^{k_{11}}, & (\lambda - \lambda_1)^{k_{12}}, & \dots, & (\lambda - \lambda_1)^{k_{1q_1}}, \\
(\lambda - \lambda_2)^{k_{21}}, & (\lambda - \lambda_2)^{k_{22}}, & \dots, & (\lambda - \lambda_2)^{k_{2q_2}}, \\
\dots & & & \\
(\lambda - \lambda_t)^{k_{t1}}, & (\lambda - \lambda_t)^{k_{t2}}, & \dots, & (\lambda - \lambda_t)^{k_{tq_t}}
\end{array} \tag{7}$$

We do not indicate the positions of the polynomials (7) on the
principal diagonal, since the diagonal elements of any diagonal
$\lambda$-matrix can be arbitrarily rearranged by interchanging rows and
like columns. This is worth bearing in mind for the future.

Let $q$ be the largest of the numbers $q_i$, $i = 1, 2, \dots, t$. Denote
by $e_{n-j+1}(\lambda)$ the product of polynomials in the $j$th column of array
(7), $j = 1, 2, \dots, q$, that is,

$$e_{n-j+1}(\lambda) = \prod_{i=1}^{t} (\lambda - \lambda_i)^{k_{ij}} \tag{8}$$

If there are certain vacancies in the $j$th column (it may happen
that $q_i < j$ for certain $i$), then the corresponding factors in (8) are
considered to be unity. Since, by hypothesis, the numbers
$\lambda_1, \lambda_2, \dots, \lambda_t$ are distinct, the powers of the linear binomials in
the $j$th column of array (7) are pairwise relatively prime. Therefore,
on the basis of the lemma proved above, they can, by means of
elementary transformations, be replaced in the diagonal matrix
at hand by their product $e_{n-j+1}(\lambda)$ and by a certain number
of units.

Doing this for $j = 1, 2, \dots, q$, we find that

$$J - \lambda E \sim \begin{pmatrix}
1 & & & & & & 0 \\
 & \ddots & & & & & \\
 & & 1 & & & & \\
 & & & e_{n-q+1}(\lambda) & & & \\
 & & & & \ddots & & \\
 & & & & & e_{n-1}(\lambda) & \\
0 & & & & & & e_n(\lambda)
\end{pmatrix} \tag{9}$$

This is the desired canonical form of the matrix $J - \lambda E$. Indeed, the
leading coefficients of all polynomials on the principal diagonal
of (9) are equal to unity and each of the polynomials is exactly
divisible by the preceding one, by Condition (6).
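The passage from the Jordan submatrices to array (7) and then to the products (8) is mechanical, and can be sketched in a few lines. The code below is an illustration under stated assumptions: a polynomial $\prod_i (\lambda - \lambda_i)^{k_i}$ is represented as a dictionary mapping each root $\lambda_i$ to its exponent $k_i$ (the empty dictionary standing for the polynomial 1), and the list of blocks is hypothetical sample data, not the example of the text. The function names are likewise introduced only for this sketch.

```python
def invariant_factors(blocks):
    """Invariant factors e_1, ..., e_n of J - lam*E for a Jordan matrix J.

    blocks: list of (number, order) pairs, one per Jordan submatrix.
    Returns a list of n dictionaries {root: exponent}; index m holds e_(m+1).
    """
    n = sum(order for _, order in blocks)

    # Rows of array (7): orders of the submatrices for each distinct number,
    # arranged in nonincreasing order as in condition (6).
    rows = {}
    for number, order in blocks:
        rows.setdefault(number, []).append(order)
    for orders in rows.values():
        orders.sort(reverse=True)

    q = max(len(orders) for orders in rows.values())
    factors = [dict() for _ in range(n)]          # all equal to 1 initially
    for j in range(1, q + 1):
        # Product (8) over the j-th column; vacancies contribute the factor 1.
        column = {number: orders[j - 1]
                  for number, orders in rows.items() if len(orders) >= j}
        factors[n - j] = column                   # this is e_(n-j+1)
    return factors

def divides(small, big):
    """True if the polynomial `small` divides the polynomial `big`."""
    return all(big.get(root, 0) >= exp for root, exp in small.items())

# Hypothetical Jordan matrix of order 9: blocks of orders 3, 1, 1 for the
# number 2 and two blocks of order 2 for the number 5.
blocks = [(2, 3), (5, 2), (2, 1), (5, 2), (2, 1)]
factors = invariant_factors(blocks)
print(factors[-3:])
```

For this sample the three nontrivial factors are $e_9(\lambda) = (\lambda-2)^3(\lambda-5)^2$, $e_8(\lambda) = (\lambda-2)(\lambda-5)^2$ and $e_7(\lambda) = \lambda - 2$, and each divides the next, as the canonical form requires.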


Example. Let 


/ 


J- 


') 

1 

0 

u 

J 

2 

I 

1 

j 0 

0 

0 


0 \ 




.» 1 
() 




(I 



J 


For this Jordan matrix of order 9, the polynomial array (7) is of the form

(X - 2)^, X - X ~ 2. 

(X - (X - :.i- 

Thus the invariant factors of the matrix $J$ are the polynomials

(X) = (X - 2)3 (X - 5)S 
(i^) = (X - 3) (X _ 
r: (X) =: (X _ 2) 
whereas $e_6(\lambda) = \dots = e_1(\lambda) = 1$.


Now that we have learned how, judging by the form of a given
Jordan matrix $J$, to write down the canonical form of its characteristic
matrix straightaway, we can prove the following theorem.

Two Jordan matrices are similar if and only if they consist of the
same Jordan submatrices, that is to say, if they differ at most solely
in the order of these submatrices on the principal diagonal.

Actually, the polynomial array (7) was completely determined
by the set of Jordan submatrices of the Jordan matrix $J$ and did
not in the least reflect the arrangement of the Jordan submatrices
along the principal diagonal of the matrix. It then follows that if
Jordan matrices $J$ and $J'$ have the same set of Jordan submatrices,
then they are associated with one and the same array (7) of polynomials
and therefore the same polynomials (8). Thus, the characteristic
matrices $J - \lambda E$ and $J' - \lambda E$ have the same invariant factors,
that is to say, they are equivalent, and therefore the matrices $J$
and $J'$ are similar.

Conversely, if the Jordan matrices $J$ and $J'$ are similar, then
their characteristic matrices have the same invariant factors. Let
the polynomials (8) for $j = 1, 2, \dots, q$ be those invariant factors
which are different from unity. But the polynomial array (7) can be
restored from the polynomials (8). Namely, the polynomials (8)
can be factored into a product of powers of linear factors, since,
as has already been proved, this property is possessed by the
invariant factors of the characteristic matrix of any Jordan matrix.
Array (7) just consists of all those maximal powers of the linear
factors into which the polynomials (8) are factored. Finally, using
array (7) we can restore the Jordan submatrices of the original Jordan
matrices: to every polynomial $(\lambda - \lambda_i)^{k_{ij}}$ of (7) there corresponds
a Jordan submatrix of order $k_{ij}$ that refers to the number $\lambda_i$. This
proves that the matrices $J$ and $J'$ consist of the same Jordan
submatrices and differ at most in their order alone.

One consequence of this theorem is that a Jordan matrix similar
to a diagonal matrix is diagonal and that two diagonal matrices are
similar if and only if they can be obtained from one another by permuting
the numbers on the principal diagonal.

Reducing a matrix to Jordan normal form. If a matrix $A$ with
elements from the field $P$ can be reduced to a Jordan normal form,
i.e., is similar to a Jordan matrix, then, as follows from the theorem
that was proved above, the Jordan normal form is determined uniquely
for matrix $A$ to within the order of the Jordan submatrices on the
principal diagonal. The condition that allows a matrix $A$ to be so reduced
is given in the following theorem, the proof of which offers a practical
procedure for finding a Jordan matrix similar to $A$ if such
a Jordan matrix exists. Note that reducibility over the field $P$ means
that all the elements of the matrix undergoing transformation are
in $P$.

Matrix $A$ with elements in the field $P$ can be reduced over $P$ to the
Jordan normal form if and only if all the characteristic roots of $A$ lie
in the base field $P$ itself.

Indeed, if matrix $A$ is similar to the Jordan matrix $J$, these
two matrices have the same characteristic roots. However, the
characteristic roots of $J$ are easily found: since the determinant of the
matrix $J - \lambda E$ is equal to the product of its elements on the principal
diagonal, the polynomial $|J - \lambda E|$ can be factored over $P$
into linear factors and its roots are numbers (and only these numbers)
on the principal diagonal of $J$.

Conversely, let all characteristic roots of matrix $A$ be in the
field $P$. If the different-from-unity invariant factors of the matrix
$A - \lambda E$ are

$$e_{n-q+1}(\lambda), \ \dots, \ e_{n-1}(\lambda), \ e_n(\lambda) \tag{10}$$

then

$$|A - \lambda E| = (-1)^n\, e_{n-q+1}(\lambda) \cdots e_{n-1}(\lambda)\, e_n(\lambda)$$

Indeed, the determinants of the matrix $A - \lambda E$ and its canonical
matrix can only differ in a constant factor that is actually equal to
$(-1)^n$, since such, precisely, is the leading coefficient of the
characteristic polynomial $|A - \lambda E|$. Thus, among the polynomials
(10) there are none equal to zero, the sum of the degrees of these
polynomials is equal to $n$, and all can be factored over the field $P$
into linear factors, which is due to the fact that, by hypothesis, the
polynomial $|A - \lambda E|$ has such a factorization.

Let (8) be factorizations of the polynomials (10) into products
of the powers of the linear factors. We use the term elementary
divisors of the polynomial $e_{n-j+1}(\lambda)$, $j = 1, 2, \dots, q$, for powers
(different from unity) of the various linear binomials entering into its
factorization (8), that is,

$$(\lambda - \lambda_1)^{k_{1j}}, \ (\lambda - \lambda_2)^{k_{2j}}, \ \dots, \ (\lambda - \lambda_t)^{k_{tj}}$$

We call the elementary divisors of all polynomials (10) the
elementary divisors of the matrix $A$ and write them down in the form
of array (7).

Let us now take a Jordan matrix $J$ of order $n$ composed of Jordan
submatrices defined as follows: with each elementary divisor
$(\lambda - \lambda_i)^{k_{ij}}$ of matrix $A$ we associate a Jordan submatrix of order
$k_{ij}$ referring to the number $\lambda_i$. It is evident that only the polynomials
(10) are invariant factors, different from unity, of the matrix $J - \lambda E$.
Therefore, matrices $A - \lambda E$ and $J - \lambda E$ are equivalent and, hence,
matrix $A$ is similar to the Jordan matrix $J$.


Example. Given a matrix

$$A = \begin{pmatrix}
-16 & -17 & 87 & -108 \\
8 & 9 & -42 & 54 \\
-3 & -3 & 16 & -18 \\
-1 & -1 & 6 & -8
\end{pmatrix}$$

Reducing the matrix $A - \lambda E$ to canonical form in the usual way, we find that
the invariant factors different from unity of this matrix are the polynomials

$$e_4(\lambda) = (\lambda - 1)^2(\lambda + 2), \qquad e_3(\lambda) = \lambda - 1$$

We see that matrix $A$ can be reduced to the Jordan normal form even in the
field of rational numbers. Its elementary divisors are the polynomials
$(\lambda - 1)^2$, $\lambda - 1$ and $\lambda + 2$ and so the matrix

$$J = \begin{pmatrix}
1 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & -2
\end{pmatrix}$$

is the Jordan normal form of the matrix $A$.

If we wanted to find the nonsingular matrix that transforms $A$ to $J$, we
would have to make use of the remarks made at the end of Sec. 60.
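The rule "one Jordan submatrix per elementary divisor" is easy to mechanize. The sketch below is illustrative: `jordan_matrix` (a name chosen here) assembles $J$ from a list of pairs (root, exponent). As a sanity check that is not the method of the text, the code also compares the ranks of $(X - E)$, $(X - E)^2$ and $(X + 2E)$ for $X = A$ and $X = J$; equal ranks are necessary for similarity, and they confirm the block sizes. The entries of $A$ are taken from the example as reconstructed above.

```python
from fractions import Fraction

def jordan_matrix(divisors):
    """Jordan matrix with one block of order k per divisor (lam - root)^k."""
    n = sum(k for _, k in divisors)
    J = [[0] * n for _ in range(n)]
    pos = 0
    for root, k in divisors:
        for i in range(k):
            J[pos + i][pos + i] = root
            if i + 1 < k:
                J[pos + i][pos + i + 1] = 1   # unity just above the diagonal
        pos += k
    return J

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted(X, c):
    """The matrix X - c*E."""
    return [[x - (c if i == j else 0) for j, x in enumerate(row)]
            for i, row in enumerate(X)]

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[-16, -17, 87, -108],
     [8, 9, -42, 54],
     [-3, -3, 16, -18],
     [-1, -1, 6, -8]]

# Elementary divisors (lam - 1)^2, lam - 1, lam + 2.
J = jordan_matrix([(1, 2), (1, 1), (-2, 1)])

def rank_profile(X):
    N = shifted(X, 1)
    return rank(N), rank(mat_mul(N, N)), rank(shifted(X, -2))

print(J, rank_profile(A), rank_profile(J))
```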

Finally, on the basis of the foregoing results we can prove the
following necessary and sufficient condition for reducing a matrix
to diagonal form, a condition that immediately yields the sufficient
criterion of reducibility to diagonal form that was proved in Sec. 33.

An $n$th-order matrix $A$ with elements in the field $P$ can be reduced
to diagonal form if and only if all the roots of the last invariant factor
$e_n(\lambda)$ of its characteristic matrix are in $P$ (and there must be no
multiple roots among them).

Indeed, reducibility of a matrix to diagonal form is equivalent
to reducibility to a Jordan form such that all Jordan submatrices
have order 1. In other words, all elementary divisors of matrix $A$
must be polynomials of degree one. However, since all invariant
factors of the matrix $A - \lambda E$ are divisors of the polynomial $e_n(\lambda)$,
the last condition is equivalent to all elementary divisors of the
polynomial $e_n(\lambda)$ having degree one, which is what we set out to prove.


62. Minimal Polynomials 

Suppose we have a square matrix $A$ of order $n$ with elements in
the field $P$. If

$$f(\lambda) = \alpha_0\lambda^k + \alpha_1\lambda^{k-1} + \dots + \alpha_{k-1}\lambda + \alpha_k$$

is an arbitrary polynomial in the ring $P[\lambda]$, then the matrix

$$f(A) = \alpha_0A^k + \alpha_1A^{k-1} + \dots + \alpha_{k-1}A + \alpha_kE$$

is called the value of the polynomial $f(\lambda)$ for $\lambda = A$. Note, in this
respect, that the constant term of the polynomial $f(\lambda)$ is multiplied
by the zero power of the matrix $A$, that is to say, by the unit matrix $E$.
It can be verified readily that if

$$f(\lambda) = \varphi(\lambda) + \psi(\lambda)$$

or

$$f(\lambda) = u(\lambda)\,v(\lambda)$$

then

$$f(A) = \varphi(A) + \psi(A)$$

and, respectively,

$$f(A) = u(A)\,v(A)$$

If the polynomial $f(\lambda)$ is annihilated by the matrix $A$, that is,

$$f(A) = 0$$

then $A$ will be called the matrix root or (where no confusion is
possible) simply the root of the polynomial $f(\lambda)$.

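Every matrix $A$ serves as a root of some nonzero polynomial.

We know for a fact that all square matrices of order $n$ constitute
an $n^2$-dimensional vector space over the field $P$. From this it
follows that the system of $n^2 + 1$ matrices

$$A^{n^2}, \ A^{n^2-1}, \ \dots, \ A, \ E$$

is linearly dependent over $P$, that is, in $P$ there are elements
$\alpha_0, \alpha_1, \dots, \alpha_{n^2}$, not all zero, such that

$$\alpha_0A^{n^2} + \alpha_1A^{n^2-1} + \dots + \alpha_{n^2-1}A + \alpha_{n^2}E = 0$$

Thus, matrix $A$ proved to be a root of the nonzero polynomial

$$\alpha_0\lambda^{n^2} + \alpha_1\lambda^{n^2-1} + \dots + \alpha_{n^2-1}\lambda + \alpha_{n^2}$$

whose degree does not exceed $n^2$.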

The matrix $A$ is also a root of certain polynomials whose leading
coefficients are equal to unity: it suffices to take any nonzero
polynomial that can be annihilated by $A$ and divide it by its leading
coefficient. The polynomial of lowest degree with leading coefficient 1
that can be annihilated by $A$ is called the minimal polynomial of the
matrix $A$. Notice that the minimal polynomial of $A$ is uniquely defined,
since the difference of two such polynomials would have a lower
degree than each one separately, but it would also be annihilable
by the matrix $A$.

Any polynomial $f(\lambda)$ that is annihilable by the matrix $A$ is exactly
divisible by the minimal polynomial $m(\lambda)$ of this matrix.

Actually, if

$$f(\lambda) = m(\lambda)\,q(\lambda) + r(\lambda)$$

where $r(\lambda)$ is nonzero and its degree is less than the degree of $m(\lambda)$, then

$$f(A) = m(A)\,q(A) + r(A)$$

and from $f(A) = m(A) = 0$ it follows that $r(A) = 0$, which
contradicts the definition of a minimal polynomial.

Let us prove the following theorem.

The minimal polynomial of a matrix $A$ coincides with the last
invariant factor $e_n(\lambda)$ of the characteristic matrix $A - \lambda E$.

Proof. Retaining notations and using the results of Sec. 59,
we can write the equation

$$(-1)^n\,|A - \lambda E| = d_{n-1}(\lambda)\,e_n(\lambda) \tag{1}$$

whence it follows, for one thing, that the polynomials $d_{n-1}(\lambda)$ and
$e_n(\lambda)$ are not zero polynomials. Next, denote by $B(\lambda)$ the adjoint
of the matrix $A - \lambda E$ (see Sec. 14),

$$B(\lambda) = (A - \lambda E)^*$$

As follows from (3), Sec. 14, the equation

$$(A - \lambda E)\,B(\lambda) = |A - \lambda E|\,E \tag{2}$$

holds true. On the other hand, since the elements of $B(\lambda)$ are $(n-1)$th-order
minors (with plus or minus signs) of the matrix $A - \lambda E$,
and only these minors, and the polynomial $d_{n-1}(\lambda)$ is the greatest
common divisor of all these minors, it follows that

$$B(\lambda) = d_{n-1}(\lambda)\,C(\lambda) \tag{3}$$

the greatest common divisor of the elements of matrix $C(\lambda)$ being
equal to 1.

From equations (2), (3) and (1) follows the equation

$$(A - \lambda E)\,d_{n-1}(\lambda)\,C(\lambda) = (-1)^n\,d_{n-1}(\lambda)\,e_n(\lambda)\,E$$

We can divide through by the nonzero factor $d_{n-1}(\lambda)$, as follows
from the general remark that if $\varphi(\lambda)$ is a nonzero polynomial and
$D(\lambda) = (d_{ij}(\lambda))$ is a nonzero $\lambda$-matrix [let $d_{st}(\lambda) \neq 0$], then the
$(s, t)$ position in the matrix $\varphi(\lambda)\,D(\lambda)$ will be occupied by the
nonzero element $\varphi(\lambda)\,d_{st}(\lambda)$. Thus,

$$(A - \lambda E)\,C(\lambda) = (-1)^n\,e_n(\lambda)\,E$$

whence

$$e_n(\lambda)\,E = (\lambda E - A)\,[(-1)^{n+1}\,C(\lambda)] \tag{4}$$

This equation shows that the remainder resulting from "left"
division of the $\lambda$-matrix in the left member by the binomial $\lambda E - A$
is equal to zero. From the lemma proved at the end of Sec. 60 it
follows, however, that this remainder is equal to the matrix $e_n(A)\,E = e_n(A)$.
True enough, the matrix $e_n(\lambda)\,E$ may be written as a matrix
$\lambda$-polynomial whose coefficients are scalar matrices, i.e., such
as commute with the matrix $A$. Thus,

$$e_n(A) = 0$$

which is to say that the polynomial $e_n(\lambda)$ is indeed annihilated by $A$.

From this it follows that the polynomial $e_n(\lambda)$ is exactly divisible
by the minimal polynomial $m(\lambda)$ of matrix $A$,

$$e_n(\lambda) = m(\lambda)\,q(\lambda) \tag{5}$$

It is clear that the leading coefficient of the polynomial $q(\lambda)$ is
equal to unity.

Since $m(A) = 0$, then, by the same lemma of Sec. 60, the remainder
after left-division of the $\lambda$-matrix $m(\lambda)\,E$ by the binomial
$\lambda E - A$ is again equal to zero; that is,

$$m(\lambda)\,E = (\lambda E - A)\,Q(\lambda) \tag{6}$$

The equations (5), (4) and (6) lead to the equation

$$(\lambda E - A)\,[(-1)^{n+1}\,C(\lambda)] = (\lambda E - A)\,[Q(\lambda)\,q(\lambda)]$$

The common factor $\lambda E - A$ can be cancelled out of both sides
since the leading coefficient $E$ of this matrix $\lambda$-polynomial is a
nonsingular matrix. Thus,

$$C(\lambda) = (-1)^{n+1}\,Q(\lambda)\,q(\lambda)$$

We recall, however, that the greatest common divisor of the elements
of matrix $C(\lambda)$ is unity. Therefore, the polynomial $q(\lambda)$
must be of degree zero, and since its leading coefficient is unity,
$q(\lambda) = 1$. Thus, by (5),

$$e_n(\lambda) = m(\lambda)$$

which completes the proof.

Since, by (1), the characteristic polynomial of matrix $A$ is exactly
divisible by the polynomial $e_n(\lambda)$, there follows from the theorem
just proved the Cayley-Hamilton theorem.

Cayley-Hamilton Theorem. Every matrix is a root of its characteristic
polynomial.
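Both statements can be observed on the fourth-order matrix of the example in Sec. 61, whose invariant factors were given there as $e_4(\lambda) = (\lambda-1)^2(\lambda+2)$ and $e_3(\lambda) = \lambda - 1$. The following sketch is an illustration only; `poly_at_matrix` is a helper name chosen here, polynomials are passed as coefficient lists with the highest power first, and the expanded coefficients are worked out for this note. It evaluates $f(A)$ by Horner's scheme and checks that $e_4$ annihilates $A$, that its proper divisor $(\lambda-1)(\lambda+2)$ does not, and that the product $e_3e_4$, which coincides with the characteristic polynomial here because $n = 4$ is even, also annihilates $A$.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, M):
    """Value f(M) by Horner's scheme; the constant term multiplies E."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, M)
        for i in range(n):
            result[i][i] += c
    return result

A = [[-16, -17, 87, -108],
     [8, 9, -42, 54],
     [-3, -3, 16, -18],
     [-1, -1, 6, -8]]
ZERO = [[0] * 4 for _ in range(4)]

e4 = [1, 0, -3, 2]                # (lam - 1)^2 (lam + 2) = lam^3 - 3 lam + 2
proper_divisor = [1, 1, -2]       # (lam - 1)(lam + 2)   = lam^2 + lam - 2
char_poly = [1, -1, -3, 5, -2]    # (lam - 1)^3 (lam + 2)

print(poly_at_matrix(e4, A) == ZERO)               # True
print(poly_at_matrix(proper_divisor, A) == ZERO)   # False
print(poly_at_matrix(char_poly, A) == ZERO)        # True
```

The other monic proper divisors of $e_4(\lambda)$ fail as well, since $A$ is not a scalar matrix and $(A - E)^2 \neq 0$; only one divisor is tested above to keep the sketch short.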

The minimal polynomial of a linear transformation. Let us
first prove the following assertion.

If matrices $A$ and $B$ are similar and if the polynomial $f(\lambda)$ is
annihilated by matrix $A$, then it is also annihilated by matrix $B$.

Indeed, let

$$B = C^{-1}AC$$

If

$$f(\lambda) = \alpha_0\lambda^k + \alpha_1\lambda^{k-1} + \dots + \alpha_{k-1}\lambda + \alpha_k$$

then

$$\alpha_0A^k + \alpha_1A^{k-1} + \dots + \alpha_{k-1}A + \alpha_kE = 0$$

Transforming both sides of this equation by matrix $C$, we get

$$\begin{aligned}
C^{-1}(\alpha_0A^k + \alpha_1A^{k-1} + \dots + \alpha_{k-1}A + \alpha_kE)\,C
&= \alpha_0(C^{-1}AC)^k + \alpha_1(C^{-1}AC)^{k-1} + \dots + \alpha_{k-1}(C^{-1}AC) + \alpha_kE \\
&= \alpha_0B^k + \alpha_1B^{k-1} + \dots + \alpha_{k-1}B + \alpha_kE = 0
\end{aligned}$$

i.e., $f(B) = 0$.

From this it follows that similar matrices have one and the same
minimal polynomial.
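The identity $f(C^{-1}AC) = C^{-1}f(A)\,C$ behind this argument can be checked on the second-order matrices of the example at the end of Sec. 60. The sketch below is illustrative: $C$ is taken to be the product of the two column operations described in that example, the polynomial $\lambda^2 + 1$ is an arbitrary choice made here, and $\lambda^2 - \lambda - 6$ is the nontrivial invariant factor from that example's canonical form.

```python
from fractions import Fraction

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, M):
    """Value f(M) by Horner's scheme; the constant term multiplies E."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, M)
        for i in range(n):
            result[i][i] += c
    return result

A = [[-2, 1], [0, 3]]
C = [[Fraction(-1, 4), 0], [2, 1]]
C_inv = [[-4, 0], [8, 1]]
B = mat_mul(mat_mul(C_inv, A), C)      # B = C^{-1} A C
ZERO = [[0, 0], [0, 0]]

f = [1, 0, 1]      # lam^2 + 1 (arbitrary polynomial, annihilates neither)
m = [1, -1, -6]    # lam^2 - lam - 6

# f(B) computed directly and via the transformed value of f(A).
f_of_B = poly_at_matrix(f, B)
transformed_f_of_A = mat_mul(mat_mul(C_inv, poly_at_matrix(f, A)), C)
print(f_of_B == transformed_f_of_A)
print(poly_at_matrix(m, A) == ZERO, poly_at_matrix(m, B) == ZERO)
```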

Now let $\varphi$ be a linear transformation of an $n$-dimensional linear
space over the field $P$. The matrices that represent this transformation
in different bases of the space are similar. The common minimal
polynomial of these matrices is termed the minimal polynomial of
the linear transformation $\varphi$.

Using the operations (on linear transformations) introduced
in Sec. 32, we can introduce the concept of the value of the polynomial

$$f(\lambda) = \alpha_0\lambda^k + \alpha_1\lambda^{k-1} + \dots + \alpha_{k-1}\lambda + \alpha_k$$

from the ring $P[\lambda]$ for $\lambda$ equal to the linear transformation $\varphi$; this
is the linear transformation

$$f(\varphi) = \alpha_0\varphi^k + \alpha_1\varphi^{k-1} + \dots + \alpha_{k-1}\varphi + \alpha_k\varepsilon$$

where $\varepsilon$ is the identity transformation.

We furthermore say that the polynomial $f(\lambda)$ is annihilated by the
linear transformation $\varphi$ if

$$f(\varphi) = \omega$$

where $\omega$ is the zero transformation.

If the reader takes into account the relationship between operations
on linear transformations and on matrices, it will be easy
for him to prove that the minimal polynomial of the linear transformation
$\varphi$ is that uniquely determined polynomial of minimum degree
with leading coefficient 1 which is annihilated by the transformation $\varphi$.
After that the results obtained above, in particular the Cayley-Hamilton
theorem, can be rephrased in the language of linear transformations.

63. Definition of a Group. Examples

Rings and fields, which played so important a role in the previous
chapters, are algebraic systems with two independent operations:
addition and multiplication. However, there are many areas of
mathematics and its applications in which we very often encounter
algebraic systems with only one algebraic operation defined. Thus,
confining ourselves to examples that have already appeared in this
book, we have the set of permutations of degree n (see Sec. 3) in
which we defined the single operation of multiplication. On the
other hand, the definition of a vector space (Sec. 8) includes the
addition of vectors, whereas multiplication of vectors was not
defined (notice that the multiplication of a vector by a scalar does
not satisfy the definition, given in Sec. 44, of an algebraic
operation).

Groups form the most important type of algebraic systems with
a single operation. This concept has extensive applications and forms
the subject of a whole science: the theory of groups. The present
chapter may be regarded as an introduction to the theory of groups,
including such elementary facts about groups as are needed by every
mathematician and also, at the end, a theorem that is not so
elementary.

Let us agree, as is the custom in group theory, to call the algebraic
operation at hand multiplication and to use appropriate symbolism.
It will be recalled (see Sec. 44) that an algebraic operation is
always assumed to be valid and unique: for any two elements a and b
of a given set the product ab exists and is a uniquely defined element
of the set.

A group is a set G with one algebraic operation that is associative
(though not necessarily commutative); the operation must have
an inverse.

Because of the possible noncommutativity of the group operation,
the possibility of the inverse operation signifies the following:
for any two elements a and b in G there exist in G a uniquely defined
element x and a uniquely defined element y such that

ax = b, ya = b

If a group G consists of a finite number of elements, then it is
called a finite group, and the number of elements in it is the order
of the group. If the operation defined in G is commutative, then G
is called a commutative group or an Abelian group.

Some simple consequences follow from the definition of a group.
On the basis of reasoning already given in Sec. 44, we can assert
that the associative law permits speaking in unique fashion about
the product of any finite number of elements of a group specified
(due to the possible noncommutativity of the group operation) in a
definite order.

Let us examine the consequences which follow from the existence 
of the inverse operation. 

Let an arbitrary element a be given in a group G. From the
definition of a group there follows the existence in G of a uniquely
defined element $e_a$ such that $a e_a = a$; thus, this element plays
the role of unity (identity) when multiplied by element a.
If b is any other element of G and if y is a group element satisfying
the equation ya = b (its existence follows from the definition of a
group), we get

$$b = ya = y (a e_a) = (ya) e_a = b e_a$$

Thus, the element $e_a$ plays the role of a right-identity with respect
to all elements of the group G, and not only with respect to the
initial element a; we therefore denote it by e'. From the
unambiguousness implicit in the definition of the inverse operation
follows the uniqueness of this element.

In similar fashion, we can prove the existence and uniqueness
in the group G of an element e'' that satisfies the condition e''a = a
for all a in G. Indeed, the elements e' and e'' coincide since the
equalities e''e' = e'' and e''e' = e' imply e'' = e'. This proves that
in any group G there is a uniquely defined element e satisfying the
condition

ae = ea = a

for all a in G. This element is termed the unit (identity) element
of G and is ordinarily denoted by the symbol 1.

From the definition of a group there also follows the existence
and uniqueness, for a given element a, of elements a' and a'' such that

aa' = 1, a''a = 1

Actually, the elements a' and a'' coincide; from the equalities

$$a''aa' = a'' (aa') = a'' \cdot 1 = a''$$
$$a''aa' = (a''a) a' = 1 \cdot a' = a'$$

follows a'' = a'. This element is called the inverse element of a and
is denoted by $a^{-1}$, that is,

$$a a^{-1} = a^{-1} a = 1$$

Thus, every element of a group has a unique inverse element.

From the foregoing equalities it follows that the inverse of the 
element a~^ is the element a itself. It is readily seen that the inverse 
of a product of several elements is the product of the inverses taken 
in the opposite order: 

$$(a_1 a_2 \ldots a_n)^{-1} = a_n^{-1} a_{n-1}^{-1} \ldots a_1^{-1}$$

Finally, the unit element is its own inverse. 

Checking whether a given set with one operation is a group is
greatly simplified by the fact that in the definition of a group the
requirement that there be an inverse operation can be replaced by the
assumption of the existence of a unit (identity) element and inverse
elements (and only on one side, say, the right, and without any
assumption about their uniqueness). This follows from the theorem
which we will now prove.

A set G with a single associative operation is a group if there is at
least one element e in G with the property

ae = a for all a in G

and if among the right-identities there is at least one element $e_0$
such that, relative to it, any element a in G has at least one
right-inverse $a^{-1}$:

$$a a^{-1} = e_0$$

Proof. Let $a^{-1}$ be one of the right-inverses of a. Then

$$a a^{-1} = e_0 = e_0 e_0 = e_0 a a^{-1}$$

That is, $a a^{-1} = e_0 a a^{-1}$. Multiplying both sides of this
equation on the right by one of the elements that are right-inverse for
$a^{-1}$, we get $a e_0 = e_0 a e_0$, whence $a = e_0 a$, since $e_0$ is
a right-identity of G. Thus, the element $e_0$ also turns out to be a
left-identity of G. Now if $e_1$ is an arbitrary right-identity, $e_2$ an
arbitrary left-identity, then from the equalities

$$e_2 e_1 = e_2 \quad \text{and} \quad e_2 e_1 = e_1$$

there follows $e_1 = e_2$, i.e., any right-identity is equal to any
left-identity. This completes the proof of the existence and uniqueness,
in the set G, of a unit element (identity) which we denote (as before)
by 1. Furthermore,

$$a^{-1} = a^{-1} \cdot 1 = a^{-1} a a^{-1}$$

That is, $a^{-1} = a^{-1} a a^{-1}$, where $a^{-1}$ is one of the
right-inverses for a. Multiplying both sides of the last equality on the
right by one of the right-inverses of $a^{-1}$, we get $1 = a^{-1} a$,
i.e., the element $a^{-1}$ will also serve as a left-inverse of a. Now,
if $a_1^{-1}$ is an arbitrary right-inverse of a and $a_2^{-1}$ is an
arbitrary left-inverse, then from the equalities

$$a_2^{-1} a a_1^{-1} = (a_2^{-1} a) a_1^{-1} = a_1^{-1}$$
$$a_2^{-1} a a_1^{-1} = a_2^{-1} (a a_1^{-1}) = a_2^{-1}$$

there follows $a_1^{-1} = a_2^{-1}$, which is to say, there follows the
existence and uniqueness of the inverse $a^{-1}$ of any element a in G.

It is now easy to show that the set G is a group. Indeed, the
equations ax = b, ya = b will be satisfied, as is readily seen, by the
elements

$$x = a^{-1} b, \quad y = b a^{-1}$$

The uniqueness of these solutions follows from the fact that if, say,
$a x_1 = a x_2$, then, multiplying both sides of this equation on the
left by $a^{-1}$, we get $x_1 = x_2$. The theorem is proved.
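
For a finite set the conditions of the theorem can be tested by brute
force. The sketch below is an illustration only: the function names
is_group and unique_solutions and the sample operations modulo 5 are
assumptions introduced here, not part of the text.

```python
from itertools import product


def is_group(elements, op):
    """One-sided test from the theorem: associativity, a right-identity e0,
    and a right-inverse of every element relative to e0."""
    elements = list(elements)
    members = set(elements)
    # The operation must be algebraic: products stay inside the set.
    if any(op(a, b) not in members for a, b in product(elements, repeat=2)):
        return False
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    right_identities = [e for e in elements
                        if all(op(a, e) == a for a in elements)]
    return any(all(any(op(a, x) == e0 for x in elements) for a in elements)
               for e0 in right_identities)


def unique_solutions(elements, op):
    """Check that ax = b and ya = b each have exactly one solution."""
    elements = list(elements)
    return all(sum(op(a, x) == b for x in elements) == 1 and
               sum(op(y, a) == b for y in elements) == 1
               for a, b in product(elements, repeat=2))


def add_mod5(a, b):
    return (a + b) % 5


def mul_mod5(a, b):
    return (a * b) % 5


print(is_group(range(5), add_mod5))      # True
print(is_group(range(5), mul_mod5))      # False: 0 has no inverse
print(is_group(range(1, 5), mul_mod5))   # True
```

The operation op(a, b) = a on a two-element set is associative and every
element is a right-identity, yet no right-identity admits right-inverses
for all elements, so the test rejects it.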

We have already encountered the concept of an isomorphism:
for rings, for linear spaces and for Euclidean spaces. This concept
can be defined for groups as well, and it plays just as important
a role in group theory as it does in the theory of rings. Groups G
and G' are termed isomorphic if a one-to-one correspondence can be
established between them such that, under it, for any elements a, b
in G and for the corresponding elements a', b' in G', to the product ab
corresponds the product a'b'. As in Sec. 46 (for the zero element and
the inverse element of a ring), it may be shown that, given an
isomorphic correspondence between groups G and G', the unit element
of G is associated with the unit element of G', and if a in G is
associated with a' in G', then $a^{-1}$ is associated with $a'^{-1}$.

Passing now to examples of groups, we notice that if the operation
in the group G is called addition, then the identity (unit) element
of the group is zero and is denoted by 0, and in place of the
inverse element we speak of the opposite element (additive inverse)
denoted by -a.

As a first instance of a group, note that, with respect to addition,
any ring (and, in particular, any field) is a group; it is an Abelian
group. This is the so-called additive group of a ring. This remark
immediately yields a wealth of concrete examples of groups: the
additive group of integers, the additive group of even numbers,
additive groups of the rational numbers, the reals, the complex
numbers, etc. Note that the additive groups of integers and of even
numbers are isomorphic with each other, although the latter is only
a part of the former: a mapping that associates with every integer k
an even number 2k is one-to-one and, as can easily be verified, is
even an isomorphic mapping of the former group onto the latter.

No ring is a group with respect to multiplication because the
inverse operation (division) is not always possible. The situation
does not change if we pass from an arbitrary ring to a field, since
division by zero is impossible in a field. However, let us consider
the collection of all nonzero elements of a field. Since a field does
not contain divisors of zero (that is, the product of two nonzero
elements is also nonzero), it follows that multiplication is an
algebraic operation for this set; it will be associative and
commutative. The set of all nonzero elements of a field will be closed
under division. Hence, the set of nonzero elements of any field is an
Abelian group. It is called the multiplicative group of the field.
Instances of such groups are the multiplicative groups of the rational
numbers, the real numbers, the complex numbers.

Obviously, all positive real numbers constitute a group with
respect to multiplication. This group is isomorphic to the additive
group of all real numbers: associating the real number ln a with an
arbitrary positive number a, we get a one-to-one mapping of the
first group onto the second group; this mapping is an isomorphism
due to the equality

$$\ln (ab) = \ln a + \ln b$$


Let us now take the set of nth roots of unity in the field of complex
numbers. In Sec. 19 we proved that the product of two nth
roots of unity and also the inverse of an nth root of unity belong
to this set of numbers. Since unity, quite naturally, belongs to this
set and since multiplication of complex numbers is associative and
commutative, we find that the nth roots of unity constitute an Abelian
group with respect to multiplication; it is a finite group of order n.
Thus, for any natural number n there exist finite groups of order n.

The group (with respect to multiplication) of the nth roots of unity
is isomorphic to the additive group of the ring $Z_n$ constructed in
Sec. 45. Indeed, if $\varepsilon$ is a primitive nth root of unity, then
all elements of the first of these groups are of the form
$\varepsilon^k$, k = 0, 1, ..., n - 1. If we associate with every number
$\varepsilon^k$ an element $C_k$ of the ring $Z_n$, i.e., the class of
integers which yield k as remainder upon division by n, we get an
isomorphic correspondence between the groups under consideration: if
$0 \le k \le n - 1$, $0 \le l \le n - 1$ and if $k + l = nq + r$, where
$0 \le r \le n - 1$, and q is equal to 0 or 1, then

$$\varepsilon^k \cdot \varepsilon^l = \varepsilon^r$$ and, at the same time, $$C_k + C_l = C_r$$
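
The correspondence can be illustrated numerically. This is a sketch
under assumptions: the primitive root is taken to be exp(2*pi*i/n), the
function names root and check_isomorphism are hypothetical, and
floating-point products are compared within a tolerance.

```python
import cmath


def root(n, k):
    """The n-th root of unity eps^k, with eps = exp(2*pi*i/n) primitive."""
    return cmath.exp(2j * cmath.pi * k / n)


def check_isomorphism(n, tol=1e-9):
    """Compare eps^k * eps^l with eps^r, where k + l = n*q + r."""
    for k in range(n):
        for l in range(n):
            q, r = divmod(k + l, n)      # 0 <= r <= n - 1
            if q not in (0, 1):
                return False
            if abs(root(n, k) * root(n, l) - root(n, r)) > tol:
                return False
    return True


print(check_isomorphism(6))   # True
print(check_isomorphism(7))   # True
```

Here the exponent r plays the role of the class $C_r$, so the check
mirrors the equality $C_k + C_l = C_r$.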

At this point, it is worth indicating some numerical sets that
are not groups. Thus, the set of all integers is not a group with
respect to multiplication, the set of all positive real numbers is
not a group with respect to addition, the set of all odd numbers is
not a group with respect to addition, the set of all negative real
numbers is not a group with respect to multiplication. All these
assertions can easily be verified.

All the numerical groups considered above are of course Abelian.
Instances of Abelian groups not made up of numbers are the linear
spaces: as follows from their definition (see Secs. 20, 47), any linear
space over an arbitrary field P is an Abelian group with respect to the
operation of addition.

Let us now examine examples of noncommutative groups.

The set of all nth-order matrices over the field P is not a group
with respect to the operation of multiplication since the demand that
there be an inverse breaks down. However, if we confine our attention
to nonsingular matrices, then we get a group. Indeed, the product
of two nonsingular matrices is, as we know, nonsingular; the unit
matrix is nonsingular; every nonsingular matrix has an inverse
which is also nonsingular and, finally, the associative law, which
holds for all matrices, holds true in the particular case of nonsingular
matrices. We can therefore speak of the group of nonsingular matrices
of order n over the field P with matrix multiplication as the group
operation. This group is noncommutative for n > 1.
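
A small example of noncommutativity for n = 2 may help. The two
matrices below are illustrative choices made here, not taken from the
text, and the helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A = [[1, 1], [0, 1]]     # nonsingular: det A = 1
B = [[1, 0], [1, 1]]     # nonsingular: det B = 1
AB = matmul(A, B)
BA = matmul(B, A)

print(AB)                # [[2, 1], [1, 1]]
print(BA)                # [[1, 1], [1, 2]]
print(det2(AB))          # 1, so the product is again nonsingular
```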

The multiplication of permutations introduced in Sec. 3 provides
a very important example of a finite noncommutative group. We
know that in the set of all permutations of degree n multiplication
is an algebraic operation which is associative, although for $n \ge 3$
it is noncommutative, that the identity permutation E is the identity
of this multiplication and that every permutation has an inverse.
Thus, the set of permutations of degree n constitutes a group with
respect to multiplication; it is a finite group of order n!. This group
is termed the symmetric group of degree n and is noncommutative for
$n \ge 3$.
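
The symmetric group of small degree can be listed explicitly. In this
sketch a permutation of {0, ..., n-1} is stored as the tuple of images,
and the product pq is assumed to mean "apply p first, then q"; both the
encoding and the function names are choices made for this illustration.

```python
from itertools import permutations
from math import factorial


def multiply(p, q):
    """Product pq: apply p first, then q (assumed left-to-right convention)."""
    return tuple(q[i] for i in p)


def inverse(p):
    """Inverse permutation: sends each image back to its argument."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def symmetric_group(n):
    """All permutations of degree n."""
    return list(permutations(range(n)))


def is_commutative(group):
    return all(multiply(p, q) == multiply(q, p) for p in group for q in group)


print(len(symmetric_group(4)) == factorial(4))   # True: the order is n!
print(is_commutative(symmetric_group(2)))        # True
print(is_commutative(symmetric_group(3)))        # False
```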

In place of the set of all permutations of degree n, let us consider
only the set of even permutations, which, as we know, consists
of $\frac{1}{2} n!$ elements. Using the theorem, proved in Sec. 3, that
the parity of a permutation coincides with the parity of the number of
transpositions entering into some decomposition of this permutation
into a product of transpositions, we find that the product of two even
permutations is even. Indeed, we obtain the representation of AB
as a product of transpositions by writing the appropriate
decompositions of A and B one after the other. Furthermore, the
associativity of multiplication of permutations is known, and the
evenness of the identity permutation is obvious. Finally, the evenness
of the permutation $A^{-1}$ for the even permutation A follows at least
from the fact that the notations of these permutations may be obtained
one from the other by interchanging the upper and lower rows; that
is to say, they contain an equal number of inversions. Thus, the set
of even permutations of degree n is a finite group of order
$\frac{1}{2} n!$ with respect to multiplication. This group is called
the alternating group of degree n. It is easy to verify that it is
noncommutative for $n \ge 4$, although it is commutative for n = 3.
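
These statements can be verified for small n. The sketch assumes the
tuple encoding of permutations of {0, ..., n-1} and decides parity by
counting inversions; the function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def multiply(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[i] for i in p)


def inversions(p):
    """Number of pairs i < j with p[i] > p[j]."""
    return sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
               if p[i] > p[j])


def is_even(p):
    return inversions(p) % 2 == 0


def alternating_group(n):
    """All even permutations of degree n."""
    return [p for p in permutations(range(n)) if is_even(p)]


def is_commutative(group):
    return all(multiply(p, q) == multiply(q, p) for p in group for q in group)


A4 = alternating_group(4)
print(len(A4) == factorial(4) // 2)                           # True
print(all(is_even(multiply(p, q)) for p in A4 for q in A4))   # True: closed
print(is_commutative(alternating_group(3)))                   # True
print(is_commutative(A4))                                     # False
```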


Symmetric and alternating groups play a prominent role in the
theory of finite groups and also in the Galois theory. Notice that
it would be impossible, by analogy with alternating groups, to
construct a group of odd permutations with respect to multiplication,
since the product of two odd permutations is always an even
permutation.

A large number of diverse examples of groups are found in the 
various branches of geometry. Just one simple example of this 
nature: the set of all rotations of a sphere about its centre is a group; 
it is noncommutative if we call the result of two successive rotations 
the product of these rotations. 

64. Subgroups

A subset A of a group G is called a subgroup of this group if it
is a group with respect to the operation defined in G.

To find out whether a subset A of group G is a subgroup of G,
it is sufficient to verify that: (1) the product of any two elements
of A lies in A; (2) A contains, together with every element, the
inverse of that element. Indeed, from the fact that the associative law
holds in G it follows that it holds for elements in A; the fact that
the unit element of G belongs to A follows from (2) and (1).
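
The two conditions translate directly into a test for finite subsets.
The sketch below uses the additive group of residues modulo 12 as an
illustrative ambient group; the function names and the requirement that
the subset be nonempty are assumptions of this example.

```python
def is_subgroup(subset, op, inverse):
    """Conditions (1) and (2) for a nonempty finite subset of a group."""
    subset = set(subset)
    if not subset:
        return False
    closed = all(op(a, b) in subset for a in subset for b in subset)
    has_inverses = all(inverse(a) in subset for a in subset)
    return closed and has_inverses


def add_mod12(a, b):
    return (a + b) % 12


def neg_mod12(a):
    return (-a) % 12


print(is_subgroup({0, 4, 8}, add_mod12, neg_mod12))      # True
print(is_subgroup({0, 3, 6, 9}, add_mod12, neg_mod12))   # True
print(is_subgroup({0, 1, 2}, add_mod12, neg_mod12))      # False
```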

Many of the groups named in Sec. 63 are subgroups of other
groups indicated there. For instance, the additive group of even
numbers is a subgroup of the additive group of all integers, and the
latter, in its turn, is a subgroup of the additive group of rational
numbers. All these groups, like the additive groups of numbers in
general, are subgroups of the additive group of complex numbers.
The multiplicative group of positive real numbers is a subgroup
of the multiplicative group of all nonzero real numbers. The
alternating group of degree n is a subgroup of the symmetric group of
the same degree.

There is a point to stress: the requirement contained in the
definition of a subgroup that the subset A of group G be a group
with respect to the group operation defined in G is essential. Thus,
the multiplicative group of positive real numbers is not a subgroup
of the additive group of all real numbers, although the former set
is a subset of the latter.

If we take subgroups A and B in the group G, then their intersection
$A \cap B$, that is, the collection of elements common to A and B, is
also a subgroup of G.

Indeed, if the intersection $A \cap B$ contains elements x and y,
then they lie in the subgroup A and for this reason the product xy
and the inverse $x^{-1}$ belong to A as well. By the same reasoning, the
elements xy and $x^{-1}$ belong to the subgroup B and therefore they
are contained in the intersection $A \cap B$ too.

It is readily seen that this result holds true not only for two
subgroups, but for any number of subgroups, whether finite or even
infinite.

The subset of group G consisting of the single element 1 is obviously
a subgroup of this group. This subgroup, which is contained
in any other subgroup of G, is called the unit subgroup of group G.
On the other hand, the group G itself is one of its own subgroups.

An interesting example of subgroups are the so-called cyclic
subgroups. Let us introduce the concept of the power of an element
a of group G. If n is any natural number, then the product of n
elements equal to the element a is called the nth power of the element
a and is denoted by $a^n$. Negative powers of element a may be defined
either as elements of group G inverse to the positive powers of this
element or as products of several factors equal to the element
$a^{-1}$. These definitions actually coincide:

$$(a^n)^{-1} = (a^{-1})^n, \quad n > 0$$ (1)

To prove this, take the product of 2n factors, of which the first n
are equal to a and the remaining ones are equal to $a^{-1}$, and perform
the cancellations. The element equal both to the left member and
the right member of (1) will be denoted by $a^{-n}$. Finally, let us
agree to use the term zero power $a^0$ of element a for the element 1.

Note that if the operation in the group G is called addition,
then in place of powers of a we should speak of multiples of this
element and write ka.

It is easy to show that in any group G we have, for the powers
of any element a and for any exponents m and n (positive, negative,
or zero), the following equalities:

$$a^n \cdot a^m = a^{n+m}$$ (2)
$$(a^n)^m = a^{nm}$$ (3)

We denote by {a} the subset of G composed of all powers of
the element a, including the element a itself as its first power. The
subset {a} is a subgroup of the group G: the product of elements
of {a} lies in {a} by (2); {a} has the element 1, equal to $a^0$; and,
finally, {a} contains all its elements together with all the inverse
elements, since from (3) follows the equality

$$(a^n)^{-1} = a^{-n}$$

The subgroup {a} is called the cyclic subgroup of the group G generated
by the element a. As is evident from (2), it is always commutative,
even if the group G itself is noncommutative.

Notice that it has not been asserted above that all powers of the
element a are distinct elements of the group. If this is indeed so,
then a is called an element of infinite order. However, let there be,
among the powers of a, some which are equal, say, $a^k = a^l$ for
$k \ne l$; this is always the case for finite groups, but it may also
occur in an infinite group. If k > l, then

$$a^{k-l} = 1$$

which is to say that there are positive powers of the element a that
are equal to unity. Let n be the least positive power of the element a
equal to unity, that is,

(1) $a^n = 1$, n > 0,

(2) if $a^k = 1$, k > 0, then $k \ge n$

In this case we say that a is an element of finite order, namely, of
order n.

If an element a is of finite order n, then all the elements

$$1, a, a^2, \ldots, a^{n-1}$$ (4)

will be distinct, as is clearly seen. Any other power of the element a,
whether positive or negative, is equal to one of the elements of (4).
Indeed, if k is any integer, then, dividing k by n, we get

$$k = nq + r, \quad 0 \le r < n$$

and so, by (2) and (3),

$$a^k = (a^n)^q \cdot a^r = a^r$$ (5)

Whence it follows that if the element a is of finite order n, and
$a^k = 1$, then k must be exactly divisible by n. On the other hand,
since

$$-1 = n (-1) + (n - 1)$$

it follows that for the element a of finite order n

$$a^{-1} = a^{n-1}$$

Since the system (4) contains n elements, it follows from the
results obtained above that for element a of finite order its order n
coincides with the order (that is to say, with the number of elements)
of the cyclic subgroup {a}.
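
For a permutation the system (4) can be computed directly. This sketch
assumes permutations stored as tuples of images with the product read
left to right; the sample element a and the function names are
illustrative choices.

```python
def multiply(p, q):
    """Product pq of permutations stored as tuples: apply p, then q."""
    return tuple(q[i] for i in p)


def cyclic_subgroup(a):
    """The elements 1, a, a^2, ..., a^(n-1) of the cyclic subgroup {a}."""
    identity = tuple(range(len(a)))
    powers = [identity]
    current = a
    while current != identity:
        powers.append(current)
        current = multiply(current, a)
    return powers


def power(a, k):
    """a^k for any integer k, reduced by (5): a^k = a^r with r = k mod n."""
    powers = cyclic_subgroup(a)
    return powers[k % len(powers)]


a = (1, 0, 3, 4, 2)          # illustrative element of degree 5
subgroup = cyclic_subgroup(a)
n = len(subgroup)            # the order of a
print(n)                     # 6
print(power(a, -1) == power(a, n - 1))   # True
```

Recomputing the list of powers on every call of power is wasteful but
keeps the sketch short.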

Finally, notice that any group has one and only one element
of the first order: this is the element 1. The cyclic subgroup {1}
evidently coincides with the unit subgroup.

Cyclic groups. A group G is called a cyclic group if it consists
of the powers of one of its elements a, that is, if it coincides with
one of its cyclic subgroups {a}; here, the element a is called the
generator of the group G. It is obvious that every cyclic group is
Abelian.

An example of an infinite cyclic group is the additive group of
the integers: any integer is a multiple of the number 1;
that is to say, this number serves as the generator of the group at
hand. We could also take -1 for the generator.

An example of a finite cyclic group of order n is the multiplicative
group of the nth roots of unity; in Sec. 19 it is shown that
all these roots are powers of one of them, namely, the primitive root.

The following theorem shows that, essentially, these examples
exhaust all cyclic groups.

All infinite cyclic groups are isomorphic among themselves; all
finite cyclic groups of a given order n are also isomorphic among
themselves.

Indeed, an infinite cyclic group with generator a is mapped
one-to-one onto the additive group of the integers if every element
$a^k$ of this group is associated with the number k; this mapping is
isomorphic, since, by (2), in multiplying the powers of the element a
we add the exponents. Now if we are given a finite cyclic group G
of order n with generator a, then we denote by $\varepsilon$ the
primitive nth root of unity and associate with every element $a^k$ of
group G, $0 \le k < n$, the number $\varepsilon^k$. This is a one-to-one
mapping of the group G onto the multiplicative group of the nth roots
of unity, the isomorphic property of which follows from (2) and (5).

This theorem enables us to speak simply about an infinite cyclic
group or about a cyclic group of order n.

We now prove the following theorem.

Every subgroup of a cyclic group is itself cyclic.

Indeed, let G = {a} be a cyclic group with generator a (infinite
or finite) and let A be a subgroup of G. We assume that A is different
from the unit subgroup, otherwise there would be nothing to
prove. Suppose that $a^k$ is the least positive power of a contained
in A. There is such a power, since if A contains an element $a^{-s}$,
s > 0, different from 1, then A also contains the inverse element $a^s$.
Assume that A also has an element $a^l$, $l \ne 0$, and k does not
divide l. Then if d, d > 0, is the greatest common divisor of the
numbers k and l, there exist integers u and v such that

$$ku + lv = d$$

and therefore the subgroup A must contain the element

$$(a^k)^u \cdot (a^l)^v = a^d$$

but since under our assumptions d < k, we are in conflict with the
choice of the element $a^k$. This is proof that A = {$a^k$}.
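
The theorem can be confirmed by exhaustive search in a small case. The
sketch works in additive notation with the residues modulo 12 (an
illustrative choice), so "powers" become multiples; the function names
are hypothetical.

```python
from itertools import combinations


def is_subgroup(subset, n):
    """Closure under addition and negation modulo n."""
    return all((a + b) % n in subset and (-a) % n in subset
               for a in subset for b in subset)


def generated(k, n):
    """Cyclic subgroup of the residues modulo n generated by k."""
    return frozenset((k * t) % n for t in range(n))


def subgroups(n):
    """All subgroups of the additive group of residues modulo n."""
    found = []
    for size in range(1, n + 1):
        for rest in combinations(range(1, n), size - 1):
            candidate = frozenset((0,) + rest)   # every subgroup contains 0
            if is_subgroup(candidate, n):
                found.append(candidate)
    return found


subs = subgroups(12)
print(len(subs))             # 6, one for each divisor of 12
for A in subs:
    if len(A) > 1:
        k = min(A - {0})     # least positive element, as in the proof
        print(k, A == generated(k, 12))   # the second value is always True
```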

Decomposition of a group with respect to a subgroup. If we
take subsets M and N in a group G, then the product MN of these
subsets is to be understood as the collection of elements of G that
are representable in at least one way as the product of an element
of M by an element of N. From the associativity of the group
operation follows the associativity of multiplication of subsets of the
group,

(MN) P = M (NP)

One of the sets M, N may of course consist of just the one element a.
In this case we get the product aN of the element by the set
or the product Ma of the set by the element.

Suppose in G we have an arbitrary subgroup A. If x is any element
of G, then the product xA is called the left coset (of the group
G with respect to the subgroup A) generated by element x. The element
x naturally lies in the coset xA since the subgroup A contains a unit
element, and $x \cdot 1 = x$.

Every left coset is generated by any one of its elements, that is to 
say, if an element y lies in the coset xA, then 

yA = xA (6) 

This is true because y may be represented as 

y = xa 

where a is an element of the subgroup A. Therefore, for any elements
a' and a'' in A it will be true that

$$ya' = x (aa'),$$
$$xa'' = y (a^{-1} a'')$$

which proves (6).

From this it follows that any two left cosets of the group G relative
to the subgroup A either coincide or do not have any element in common.
Indeed, if the cosets xA and yA have a common element z, then

xA = zA = yA

Thus, the entire group G decomposes into disjoint left cosets 
relative to the subgroup A. This decomposition is called the left 
decomposition of the group G relative to the subgroup A. 

Note that one of the left cosets of this decomposition is the
subgroup A itself; this coset is generated by the element 1 or,
generally, by any element a in A, since

$$a \cdot A = A$$

Naturally, taking the product Ax as the right coset of the group
G relative to the subgroup A, this coset being generated by the
element x, we obtain, in similar fashion, a right decomposition of the
group G relative to the subgroup A. For an Abelian group, both its
decompositions (left and right) relative to any subgroup will
naturally coincide, so we can simply speak of the decomposition of a
group relative to a subgroup.

For instance, the decomposition of the additive group of the
integers relative to the subgroup of the multiples of the number k
consists of k distinct cosets generated, respectively, by the numbers
0, 1, 2, ..., k - 1. Here, the coset generated by the number l,
$0 \le l \le k - 1$, contains all the numbers which upon division by
k yield the remainder l.

In the noncommutative case, the decompositions of a group 
relative to a subgroup may prove to be distinct. 

To illustrate, let us consider the symmetric group of degree 3, $S_3$;
as in Sec. 3, we write its elements as cycles. For the subgroup A
we take the cyclic subgroup of the element (12); it consists of the
identity permutation and the permutation (12) itself. The other
left cosets are: (13)·A, consisting of the permutations (13) and (132),
and (23)·A, consisting of the permutations (23) and (123). On the
other hand, the right cosets of the group $S_3$ relative to the subgroup
A are: the subgroup A itself, the coset A·(13), consisting of the
permutations (13) and (123), and the coset A·(23), consisting of
the permutations (23) and (132). We see that in this case, the right
decomposition differs from the left decomposition.
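
The cosets listed above can be recomputed mechanically. The sketch
stores a permutation of {1, 2, 3} as the tuple of images (p(1), p(2),
p(3)) and assumes that the product pq means "apply p, then q", which
reproduces the cosets stated in the text; the constant and function
names are introduced only for this illustration.

```python
def multiply(p, q):
    """Product pq: apply p first, then q (left-to-right convention)."""
    return tuple(q[x - 1] for x in p)


E = (1, 2, 3)        # identity permutation
T12 = (2, 1, 3)      # the cycle (12)
T13 = (3, 2, 1)      # the cycle (13)
T23 = (1, 3, 2)      # the cycle (23)
C123 = (2, 3, 1)     # the cycle (123): 1 -> 2, 2 -> 3, 3 -> 1
C132 = (3, 1, 2)     # the cycle (132): 1 -> 3, 3 -> 2, 2 -> 1

A = {E, T12}         # cyclic subgroup of the element (12)


def left_coset(x, subgroup):
    return {multiply(x, a) for a in subgroup}


def right_coset(subgroup, x):
    return {multiply(a, x) for a in subgroup}


print(left_coset(T13, A) == {T13, C132})    # True
print(left_coset(T23, A) == {T23, C123})    # True
print(right_coset(A, T13) == {T13, C123})   # True
print(right_coset(A, T23) == {T23, C132})   # True
```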

For the case of finite groups, the existence of decompositions
of a group relative to a subgroup leads to the following important
theorem.

Lagrange’s theorem. In every finite group, the order of any sub- 
group is a divisor of the order of the group itself. 

Indeed, in a finite group G of order n let there be given a subgroup
A of order k. We consider the left decomposition of the group
G relative to the subgroup A. Let it consist of j cosets; the number j
is termed the index of the subgroup A in the group G. Every left
coset xA consists of exactly k elements, since if

$$x a_1 = x a_2$$

where $a_1$ and $a_2$ are elements of A, then $a_1 = a_2$. Thus,

n = kj (7)

which completes the proof.

Since the order of an element coincides with the order of its 
cyclic subgroup, it follows from the Lagrange theorem that the 
order of any element of a finite group is a divisor of the order of the 
group. 
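
Equality (7) and its corollary on the orders of elements can be checked
in the symmetric group of degree 4. This is a sketch under assumptions:
permutations are tuples of images multiplied left to right, and the
subgroup A is taken to be the cyclic subgroup of an illustrative cycle
of length 4.

```python
from itertools import permutations


def multiply(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[i] for i in p)


def cyclic_subgroup(a):
    """All powers of a, starting with the identity."""
    identity = tuple(range(len(a)))
    powers = [identity]
    current = a
    while current != identity:
        powers.append(current)
        current = multiply(current, a)
    return powers


def left_cosets(group, subgroup):
    """The set of distinct left cosets xA."""
    return {frozenset(multiply(x, h) for h in subgroup) for x in group}


G = list(permutations(range(4)))
A = cyclic_subgroup((1, 2, 3, 0))
cosets = left_cosets(G, A)
n, k, j = len(G), len(A), len(cosets)

print(n, k, j)       # 24 4 6
print(n == k * j)    # True
print(all(n % len(cyclic_subgroup(g)) == 0 for g in G))   # True
```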

It also follows from the Lagrange theorem that any finite group 
whose order is a prime number is cyclic. 

Indeed, this group must coincide with the cyclic subgroup generated
by any element of it that is different from unity.

Hence, by the above-obtained description of cyclic groups,
it follows that for any prime p there is a unique, to within
isomorphism, finite group of order p.

65. Normal Divisors, Factor Groups, Homomorphisms


A subgroup A of a group G is called a normal divisor of this group
(or an invariant subgroup) if the left decomposition of G with respect
to A coincides with the right decomposition.

Thus, all subgroups of an Abelian group are normal divisors
in it. On the other hand, in any group G both the unit subgroup and
the group itself are normal divisors: both decompositions of G with
respect to the unit subgroup coincide with the decomposition of
the group into separate elements, and both decompositions of the
group G with respect to the group itself consist of the single coset G.

Here are some of the more interesting examples of normal divisors
in noncommutative groups. In the symmetric group of degree 3, $S_3$,
the cyclic subgroup of element (123) consisting of the identity
permutation and the permutations (123) and (132) is a normal divisor;
in both decompositions of the group $S_3$ with respect to this subgroup,
the second coset consists of the permutations (12), (13) and (23).

Generally, in the symmetric group $S_n$ of degree n the alternating
group $A_n$ of degree n is a normal divisor. Indeed, the group $A_n$
is of order $\frac{1}{2} n!$ and so any coset of the group $S_n$ with
respect to the subgroup $A_n$ must consist of the same number of
elements and, consequently, there is only one other such coset, namely,
the collection of odd permutations.

In the multiplicative group of nonsingular square matrices of
order n with elements in the field P, those matrices whose
determinants equal 1 obviously constitute a subgroup. It will even be a
normal divisor, since the class of all matrices whose determinants
are equal to the determinant of the matrix M is the coset
(simultaneously left and right) with respect to this subgroup, which
coset is generated by the matrix M. It suffices to recall that in the
multiplication of matrices the determinants are multiplied together.

The definition of a normal divisor given above may be rephrased:
a subgroup A of group G is a normal divisor of this group if
for any element x in G

xA = Ax (1)

That is to say, for any element x in G and an element a in A, it is
possible to choose elements a' and a'' in A such that

xa = a'x, ax = xa'' (2)

There are other definitions of a normal divisor equivalent to the
original one. Thus, we call elements a and b of group G conjugate
if in G there is at least one element x such that

$$b = x^{-1} a x$$ (3)

or we say that element b is the transform of element a by x. From (3)
it evidently follows that

$$a = x b x^{-1} = (x^{-1})^{-1} b x^{-1}$$


A subgroup A of a group G is a normal divisor in G if and only
if, together with any element of it, a, it also contains all elements
conjugate to it in G.

Indeed, if A is a normal divisor in G, then, by (2), for the element
a that we chose in A and for any element x in G we can find in A
an element a'' such that

$ax = xa''$

Whence

$x^{-1} a x = a''$

That is, any element conjugate to a lies in A. Conversely, if a
subgroup A contains, together with any element a, all elements
conjugate to a, then in particular A also contains the element

$x^{-1} a x = a''$

whence follows the second of the equalities (2). For the same reason,
A also contains the element

$(x^{-1})^{-1} a x^{-1} = x a x^{-1} = a'$

whence follows the first of the equalities (2).
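The conjugacy criterion lends itself to a mechanical test. The sketch below, written in Python under the assumption that permutations are tuples on 0..3, checks it for two subgroups of the symmetric group of degree 4; the function is_normal and the subgroup names V4 and T are illustrative choices of ours.

```python
from itertools import permutations

def compose(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(subgroup, group):
    """True when x^{-1} a x stays in the subgroup for every a and every x."""
    members = set(subgroup)
    return all(compose(compose(inverse(x), a), x) in members
               for a in subgroup for x in group)

S4 = list(permutations(range(4)))
identity = (0, 1, 2, 3)
# Identity together with the three products of two disjoint transpositions.
V4 = [identity, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)]
# Subgroup generated by the single transposition of the symbols 0 and 1.
T = [identity, (1, 0, 2, 3)]

print(is_normal(V4, S4), is_normal(T, S4))
```

The first subgroup contains all conjugates of its elements and is a normal divisor; the second does not, since conjugating a transposition yields another transposition.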

Using this result, it is easy to prove that the intersection of any
normal divisors of group G will itself be a normal divisor of this group.
Indeed, if A and B are normal divisors of G, then, as demonstrated
in the preceding section, the intersection $A \cap B$ is a subgroup of G.
Let c be any element of $A \cap B$ and x any element of G. Then the
element $x^{-1} c x$ must lie both in A and B since both of these normal
divisors contain the element c. Whence it follows that the element
$x^{-1} c x$ is in the intersection $A \cap B$.

Factor group. The significance of the concept of a normal divisor
is based on the fact that it is possible, in a certain very natural way,
to construct a new group from the cosets with respect to a normal
divisor; due to (1) there is no need in this case to distinguish between
left and right cosets.

First notice that if A is an arbitrary subgroup of the group G,
then

$AA = A$    (4)

since the product of any two elements of the subgroup A belongs
to A and, at the same time, by multiplying all elements of A by
the unit element we already get the entire subgroup A.

Let A now be a normal divisor of G. In this case, the product of
any two cosets of G with respect to A (in the sense of multiplying
subsets of the group G) will itself be a coset with respect to A. Indeed,
using the associativity of the multiplication of subsets of a group,
and using equality (4) and

$yA = Ay$

[cf. (1)], we get

$xA \cdot yA = xyAA = xyA$    (5)

for any elements x and y of G.

Equation (5) shows that in order to find the product of two given 
cosets of group G with respect to the normal divisor A, we must 
choose in arbitrary fashion one representative in each coset (recall 
that every coset is generated by any one of its elements) and take 
the coset containing the product of these representatives. 
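Rule (5) can be observed on a small example. The following Python sketch assumes the symmetric group of degree 4 with permutations stored as tuples and takes as normal divisor the four-element subgroup V4 listed in the code; the names coset and subset_product are ours.

```python
from itertools import permutations

def compose(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

S4 = list(permutations(range(4)))
# A normal divisor of S4: identity and the products of two disjoint transpositions.
V4 = [(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)]

def coset(x):
    """The coset xA for A = V4."""
    return frozenset(compose(x, a) for a in V4)

def subset_product(X, Y):
    """Product of two subsets: all products xy with x in X and y in Y."""
    return frozenset(compose(x, y) for x in X for y in Y)

# The product of the cosets xA and yA, formed as a product of subsets,
# is compared with the coset of the product xy of the representatives.
rule_holds = all(subset_product(coset(x), coset(y)) == coset(compose(x, y))
                 for x in S4 for y in S4)
cosets = {coset(x) for x in S4}

print(rule_holds, len(cosets))
```

Because the comparison runs over all pairs of elements, it also confirms that the result does not depend on which representatives are chosen in the two cosets.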

Thus is defined the operation of multiplication in the set of 
all cosets of the group G with respect to the normal divisor A. We 
will show that all the requirements that enter into the definition of 
a group are thus fulfilled. The associativity of multiplication of 
cosets follows from the associativity of the multiplication of
subsets of the group. The role of the unit element is played by the
normal divisor A itself, which is one of the cosets of the decomposition
of G with respect to A: namely, by (4) and (1), it is true that
for any x in G,

$xA \cdot A = xA, \quad A \cdot xA = xA \cdot A = xA$

Finally, the coset $x^{-1}A$ is the inverse of the coset xA since

$xA \cdot x^{-1}A = 1 \cdot A = A$


The group thus constructed is called the factor group of the group
G with respect to the normal divisor A and is denoted G/A.

We see that every group is associated with a whole set of new
groups: its factor groups with respect to different normal divisors.
Here, the factor group of the group G with respect to the unit
subgroup will, naturally, be isomorphic to G itself.

Every factor group G/A of an Abelian group G is itself Abelian,
since from xy = yx it follows that

$xA \cdot yA = xyA = yxA = yA \cdot xA$

Every factor group G/A of a cyclic group G is cyclic, because if G
is generated by an element g, G = {g}, and if we are given an
arbitrary coset xA, then there is an integer k such that

$x = g^k$

and so

$xA = (gA)^k$

The order of any factor group G/A of a finite group G is a divisor
of the order of the group itself. Indeed, the order of the factor group
G/A is equal to the index of the normal divisor A in the group G,
and so we can take advantage of (7) of the preceding section.

Here are some instances of factor groups. Since, in the additive
group of the integers, the subgroup of multiples of the natural number
k has, as shown in the preceding section, index k, the factor
group of our group with respect to this subgroup is a finite group
of order k; it is a cyclic group because the group under consideration
is itself cyclic.
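This factor group is easy to model numerically. In the minimal Python sketch below, each coset of the subgroup of multiples of k is labelled by its smallest non-negative representative; the value k = 5, the sample range and the name coset_label are assumptions made only for illustration.

```python
k = 5

def coset_label(x):
    """Label the coset x + kZ by its smallest non-negative representative."""
    return x % k

# Adding two cosets through arbitrary representatives gives the same coset
# as adding them through their canonical labels.
samples = range(-20, 21)
well_defined = all(
    coset_label(x + y) == coset_label(coset_label(x) + coset_label(y))
    for x in samples for y in samples
)

# The multiples of the coset of 1 run through all k cosets, so the
# factor group is cyclic of order k.
generated = {coset_label(m * 1) for m in range(k)}

print(well_defined, sorted(generated))
```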

The factor group of the symmetric group $S_n$ of degree n with respect
to the alternating group $A_n$ of degree n is a group of order 2; because
2 is prime, it is a cyclic group (see the end of the preceding section).

We have already given a description of the cosets of the multiplicative
group of nonsingular matrices of order n with elements
in the field P with respect to the normal divisor composed of matrices 
whose determinants are equal to 1. From this description it follows 
that the corresponding factor group is isomorphic to the multiplica- 
tive group of nonzero numbers of P. 
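For a finite field the description of these cosets can be verified exhaustively. The Python sketch below assumes, purely for illustration, matrices of order 2 over the field of residues modulo p = 3; the names GL, SL, det, matmul and coset are ours.

```python
from itertools import product

p = 3  # field of residues modulo a prime p (illustrative choice)

def det(m):
    """Determinant of a 2 x 2 matrix over the residues modulo p."""
    (a, b), (c, d) = m
    return (a * d - b * c) % p

def matmul(m, n):
    """Product of two 2 x 2 matrices over the residues modulo p."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))

# Nonsingular matrices and, among them, those with determinant 1.
GL = [((a, b), (c, d)) for a, b, c, d in product(range(p), repeat=4)
      if det(((a, b), (c, d))) != 0]
SL = [m for m in GL if det(m) == 1]

def coset(m):
    """The coset generated by the matrix m with respect to SL."""
    return frozenset(matmul(m, s) for s in SL)

# Each coset coincides with the class of all matrices of one fixed determinant.
classes_match = all(
    coset(m) == frozenset(x for x in GL if det(x) == det(m)) for m in GL
)
number_of_cosets = len({coset(m) for m in GL})

print(len(GL), len(SL), number_of_cosets, classes_match)
```

The number of cosets equals the number of nonzero elements of the field, in agreement with the isomorphism stated above.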

Homomorphisms. The concepts of a normal divisor and a factor 
group are closely connected with the following generalization of 
the concept of an isomorphism. 

A mapping $\varphi$ of a group G onto a group G' such that to every
element a of G there corresponds a unique element $a' = a\varphi$ in G'
is called a homomorphic mapping of G onto G' (or simply a
homomorphism) if in this mapping every element a' of G' is an image of
some element a in G, $a' = a\varphi$, and if for any elements a, b of G,

$(ab)\varphi = a\varphi \cdot b\varphi$

It is quite obvious that if we also required one-to-oneness
of the mapping $\varphi$, we would obtain the already familiar definition
of an isomorphism.

If $\varphi$ is a homomorphism of group G onto group G' and 1 and a are,
respectively, the unit element and an arbitrary element of G and 1'
is the unit element of G', then

$1\varphi = 1'$,

$(a^{-1})\varphi = (a\varphi)^{-1}$

Indeed, if $1\varphi = e'$ and x' is an arbitrary element of the group G',
then there is an element x in G such that $x\varphi = x'$. Whence,

$x' = x\varphi = (x \cdot 1)\varphi = x\varphi \cdot 1\varphi = x' \cdot e'$

Similarly,

$x' = e'x'$

and, hence, e' = 1'.

On the other hand, if $(a^{-1})\varphi = b'$, then

$1' = 1\varphi = (aa^{-1})\varphi = a\varphi \cdot (a^{-1})\varphi = a\varphi \cdot b'$

and, similarly,

$1' = b' \cdot a\varphi$

whence $b' = (a\varphi)^{-1}$.

Let us use the term kernel of a homomorphism $\varphi$ of a group G
onto a group G' for the set of elements of G which are mapped under
$\varphi$ into the unit element 1' of G'.

The kernel of any homomorphism $\varphi$ of a group G is a normal divisor
of G.

Indeed, if the elements a, b of G enter into the kernel of the
homomorphism $\varphi$, i.e.,

$a\varphi = b\varphi = 1'$

then

$(ab)\varphi = a\varphi \cdot b\varphi = 1' \cdot 1' = 1'$

That is to say, the product ab is also contained in the kernel of the
homomorphism $\varphi$. On the other hand, if $a\varphi = 1'$, then

$(a^{-1})\varphi = (a\varphi)^{-1} = 1'^{-1} = 1'$

which is to say that $a^{-1}$ is also in the kernel of the homomorphism $\varphi$.
Finally, if $a\varphi = 1'$, and x is an arbitrary element of the group G,
then

$(x^{-1}ax)\varphi = (x^{-1})\varphi \cdot a\varphi \cdot x\varphi = (x\varphi)^{-1} \cdot 1' \cdot x\varphi = 1'$

The kernel of the homomorphism under consideration turned out to
be a subgroup of the group G, which contains all the elements
conjugate to any one of its elements; hence, it is a normal divisor.
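As an illustration of this statement, consider the mapping that sends every permutation of degree 3 to its sign. The Python sketch below assumes permutations stored as tuples; the function sign and the other names are ours. It checks that the mapping preserves products and that its kernel is closed under conjugation.

```python
from itertools import permutations

def compose(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

S3 = list(permutations(range(3)))

# The mapping onto the multiplicative group {1, -1} preserves products.
is_homomorphism = all(sign(compose(a, b)) == sign(a) * sign(b)
                      for a in S3 for b in S3)

# Kernel: the elements mapped into the unit element 1.
kernel = {a for a in S3 if sign(a) == 1}
closed_under_conjugation = all(
    compose(compose(inverse(x), a), x) in kernel for a in kernel for x in S3
)

print(is_homomorphism, len(kernel), closed_under_conjugation)
```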

Now let A be an arbitrary normal divisor of the group G.
Associating every element x of G with that coset xA with respect to the
normal divisor A in which the element lies, we obtain a mapping
of the group G onto the entire factor group G/A. From the definition
of multiplication in the group G/A [see (5)], it follows that this
mapping is homomorphic.

The resulting homomorphism is the canonical homomorphism
of the group G onto the factor group G/A. The normal divisor A
is itself obviously the kernel of this homomorphism.

From this it follows that only the normal divisors of the group
G serve as kernels of the homomorphisms of this group. This result can
be regarded as yet another definition of a normal divisor.

It appears that all groups onto which the group G can be
homomorphically mapped are actually exhausted by the factor groups
of this group, and all the homomorphisms of G are exhausted by its
canonical homomorphisms onto its factor groups. To be more precise,
the following theorem holds.

Theorem on homomorphisms. Suppose we have a homomorphism
$\varphi$ of a group G onto a group G'; let A be the kernel of this homomorphism.
Then the group G' is isomorphic to the factor group G/A: there exists
an isomorphic mapping $\sigma$ of the former of these groups onto the latter
such that the result of the successive mappings $\varphi$ and $\sigma$ coincides with
the canonical homomorphism of the group G onto the factor group G/A.

Indeed, let x' be an arbitrary element of G', and let x be an
element of G such that $x\varphi = x'$. Since for any element a of the kernel
A of the homomorphism $\varphi$ we have the equality $a\varphi = 1'$, it follows
that

$(xa)\varphi = x\varphi \cdot a\varphi = x' \cdot 1' = x'$

That is, all elements of the coset xA are mapped under $\varphi$ into the
element x'.

On the other hand, if z is any element of the group G such that
$z\varphi = x'$, then

$(x^{-1}z)\varphi = (x^{-1})\varphi \cdot z\varphi = (x\varphi)^{-1} \cdot z\varphi = x'^{-1} \cdot x' = 1'$

That is to say, $x^{-1}z$ is contained in the kernel A of the homomorphism
$\varphi$. If we set $x^{-1}z = a$, then z = xa, or the element z is contained in
the coset xA. Thus, collecting all the elements of the group G which
are mapped under the homomorphism $\varphi$ into the fixed element x'
of the group G', we get precisely the coset xA.

The correspondence $\sigma$, which associates every element x' of G'
with that coset of G by the normal divisor A which consists of all
elements of G having x' as its image under $\varphi$, is a one-to-one mapping
of the group G' onto the group G/A. This mapping $\sigma$ is an isomorphism
since if

$x'\sigma = xA, \quad y'\sigma = yA$

that is,

$x\varphi = x', \quad y\varphi = y'$

then

$(xy)\varphi = x\varphi \cdot y\varphi = x'y'$

and so

$(x'y')\sigma = xyA = xA \cdot yA = x'\sigma \cdot y'\sigma$

Finally, if x is an arbitrary element in G and $x\varphi = x'$ then

$(x\varphi)\sigma = x'\sigma = xA$

That is, a successive execution of the homomorphism $\varphi$ and the
isomorphism $\sigma$ indeed maps the element x into the coset xA generated
by it. The theorem is proved.
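The construction used in the proof can be followed on a small additive example. The Python sketch below assumes G to be the residues modulo 12 under addition and $\varphi$ to be reduction modulo 4, which maps G onto the residues modulo 4; the names phi, sigma, coset and coset_sum are ours and only mirror the notation of the theorem.

```python
n, m = 12, 4              # m divides n, so reduction mod m is well defined on Z_n
G = range(n)

def phi(x):
    """The homomorphism of Z_n onto Z_m."""
    return x % m

kernel = frozenset(x for x in G if phi(x) == 0)

def coset(x):
    """The coset of x with respect to the kernel (additive notation)."""
    return frozenset((x + a) % n for a in kernel)

def sigma(image):
    """All elements of G having the given element of Z_m as their image."""
    return frozenset(x for x in G if phi(x) == image)

def coset_sum(X, Y):
    """Sum of two subsets of Z_n."""
    return frozenset((x + y) % n for x in X for y in Y)

# The elements mapped into a fixed image form exactly one coset of the kernel,
# so performing phi and then sigma gives the canonical homomorphism.
fibres_are_cosets = all(sigma(phi(x)) == coset(x) for x in G)

# sigma is one-to-one and carries sums into sums of cosets.
sigma_is_injective = len({sigma(u) for u in range(m)}) == m
sigma_is_hom = all(sigma((u + v) % m) == coset_sum(sigma(u), sigma(v))
                   for u in range(m) for v in range(m))

print(sorted(kernel), fibres_are_cosets, sigma_is_injective, sigma_is_hom)
```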


66. Direct Sums of Abelian Groups

We would like to conclude this chapter with a group-theoretic
theorem that is deeper than the elementary properties of groups given
above. Namely, proceeding from the description, given in Sec. 64,
of cyclic groups, we will obtain in the next section a complete
description of finite Abelian groups.

As is customary in the theory of Abelian groups, we use the 
additive notation for the group operation; we shall speak of the 
sum a + b of elements a and b of the group, of the zero subgroup 0,
of the multiples ka of some element a, etc. 

We will examine in this section a construction that will be 
described in detail in application to Abelian groups, though it could 
have been introduced at once for arbitrary (that is, not necessarily 
commutative) groups. This construction is suggested by the follow- 
ing examples. A plane regarded as a two-dimensional real linear 
space is an Abelian group with respect to the addition of vectors. 
Any straight line in this plane passing through the coordinate origin 
is a subgroup of the indicated group. If $A_1$ and $A_2$ are two distinct
straight lines of this kind, then, as we know, any vector in the plane
that issues from the origin is uniquely represented by the sum of its
projections on the straight lines $A_1$ and $A_2$. Similarly, any vector
of three-dimensional linear space can be uniquely written as the
sum of three vectors belonging to three given straight lines $A_1$, $A_2$,
and $A_3$, provided the lines do not lie in the same plane.

An Abelian group G is called the direct sum of its subgroups
$A_1, A_2, \ldots, A_k$,

$G = A_1 + A_2 + \ldots + A_k$    (1)

if every element x of G is uniquely written as the sum of the elements
$a_1, a_2, \ldots, a_k$, taken, respectively, in the subgroups $A_1, A_2, \ldots, A_k$:

$x = a_1 + a_2 + \ldots + a_k$    (2)

The notation (1) is called the direct decomposition of the group
G, the subgroups $A_i$, $i = 1, 2, \ldots, k$, are direct summands of this
decomposition, and the element $a_i$ in (2) is a component of the
element x in the direct summand $A_i$ of the decomposition (1),
$i = 1, 2, \ldots, k$.

If we are given a direct decomposition (1) of a group G and if the
direct summands $A_i$ of this decomposition (all or some of them) are
themselves decomposed into a direct sum,

$A_i = A_{i1} + A_{i2} + \ldots + A_{ik_i}, \quad i = 1, 2, \ldots, k$    (3)

then the group G is the direct sum of all its subgroups $A_{ij}$,
$j = 1, 2, \ldots, k_i$, $i = 1, 2, \ldots, k$.

Indeed, for an arbitrary element x of G we have the notation (2)
relative to the direct decomposition (1), and for each component
$a_i$, $i = 1, 2, \ldots, k$, we have the notation

$a_i = a_{i1} + a_{i2} + \ldots + a_{ik_i}$    (4)

relative to the direct decomposition (3) of the group $A_i$. It is clear
that x is the sum of all the elements $a_{ij}$, $j = 1, 2, \ldots, k_i$,
$i = 1, 2, \ldots, k$. The uniqueness of this notation follows from the
fact that we must obtain precisely equality (2) by taking any notation
of the element x as a sum of elements, taken one each in the
subgroups $A_{ij}$, and by adding the summands belonging to the same
subgroup $A_i$, $i = 1, 2, \ldots, k$. On the other hand, each element $a_i$
only has one notation of the type (4).

The definition of a direct sum may be restated. First let us
introduce a new concept. If it is given that an Abelian group G has certain
subgroups $B_1, B_2, \ldots, B_l$, then we denote by $\{B_1, B_2, \ldots, B_l\}$ the
set of elements y of G which can in at least one way be written as
a sum of the elements $b_1, b_2, \ldots, b_l$ taken in the subgroups
$B_1, B_2, \ldots, B_l$, respectively,

$y = b_1 + b_2 + \ldots + b_l$    (5)

The set $\{B_1, B_2, \ldots, B_l\}$ will be a subgroup of G. We say that
this subgroup is generated by the subgroups $B_1, B_2, \ldots, B_l$.

For the proof, let us take in $\{B_1, B_2, \ldots, B_l\}$ an element y
with notation (5), and also an element y' with a similar notation,

$y' = b'_1 + b'_2 + \ldots + b'_l$

where $b'_i$ is an element in $B_i$, $i = 1, 2, \ldots, l$. Then

$y + y' = (b_1 + b'_1) + (b_2 + b'_2) + \ldots + (b_l + b'_l)$,

$-y = (-b_1) + (-b_2) + \ldots + (-b_l)$

which is to say that the elements y + y' and -y also have at least
one notation of the type (5) and, hence, belong to the set
$\{B_1, B_2, \ldots, B_l\}$, which completes the proof.

The subgroup $\{B_1, B_2, \ldots, B_l\}$ contains each of the subgroups
$B_i$, $i = 1, 2, \ldots, l$. Indeed, every subgroup of the group G
contains the zero element of this group and so, taking, for instance,
in the subgroup $B_1$ any element $b_1$, and in the subgroups $B_2, \ldots, B_l$
the element 0, we obtain the following notation of type (5) for
element $b_1$:

$b_1 = b_1 + 0 + \ldots + 0$

An Abelian group G is the direct sum of its subgroups $A_1, A_2, \ldots, A_k$
if and only if it is generated by these subgroups,

$G = \{A_1, A_2, \ldots, A_k\}$    (6)

and if the intersection of each subgroup $A_i$, $i = 2, \ldots, k$, with the
subgroup generated by all preceding subgroups $A_1, A_2, \ldots, A_{i-1}$
contains zero alone:

$\{A_1, A_2, \ldots, A_{i-1}\} \cap A_i = 0, \quad i = 2, \ldots, k$    (7)
Indeed, if the group G has the direct decomposition (1), then
for any element x of G the notation (2) exists, and therefore we have
equation (6). The validity of equations (7) follows from the uniqueness
of the notation (2) for any element x: if for some i the intersection
$\{A_1, A_2, \ldots, A_{i-1}\} \cap A_i$ contained a nonzero element x, then,
on the one hand, x could be written as an element $a_i$ in $A_i$, i.e.,
$x = a_i$, and so

$x = 0 + \ldots + 0 + a_i + 0 + \ldots + 0$    (8)

On the other hand, x, as an element of the subgroup $\{A_1, A_2, \ldots, A_{i-1}\}$,
would have a notation of the form

$x = a_1 + a_2 + \ldots + a_{i-1}$

which is to say that

$x = a_1 + a_2 + \ldots + a_{i-1} + 0 + \ldots + 0$    (9)

It is evident that (8) and (9) are two distinct notations of type (2)
for the element x.

Conversely, let (6) and (7) hold. From (6) it follows that any
element x of G has at least one notation of type (2). However, let
there be two distinct notations of type (2) for some element x:

$x = a_1 + a_2 + \ldots + a_k = a'_1 + a'_2 + \ldots + a'_k$    (10)

Then we can find an i, $1 \le i \le k$, such that

$a_k = a'_k, \quad a_{k-1} = a'_{k-1}, \quad \ldots, \quad a_{i+1} = a'_{i+1}$    (11)

but $a_i \ne a'_i$. That is,

$a_i - a'_i \ne 0$    (12)

From (10) and (11) follows, however, the equality

$a_i - a'_i = (a'_1 - a_1) + (a'_2 - a_2) + \ldots + (a'_{i-1} - a_{i-1})$

which contradicts (7) due to (12). The theorem is proved.
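The theorem can be compared with the definition on a finite example. The Python sketch below assumes G to be the additive group of residues modulo 12 and tests three families of cyclic subgroups both by the definition (uniqueness of the notation (2)) and by conditions (6) and (7); the function names are ours.

```python
from itertools import product

n = 12  # G is the additive group of residues modulo n (illustrative choice)

def multiples(g):
    """Cyclic subgroup of Z_n generated by g."""
    return sorted({(g * m) % n for m in range(n)})

def generated(subgroups):
    """Set of all sums b_1 + ... + b_l with b_i taken in the i-th subgroup."""
    return {sum(combo) % n for combo in product(*subgroups)}

def is_direct_sum(subgroups):
    """Definition: every element of Z_n has exactly one notation of type (2)."""
    counts = [0] * n
    for combo in product(*subgroups):
        counts[sum(combo) % n] += 1
    return all(c == 1 for c in counts)

def criterion(subgroups):
    """Conditions (6) and (7) of the theorem."""
    if generated(subgroups) != set(range(n)):
        return False
    return all(generated(subgroups[:i]) & set(subgroups[i]) == {0}
               for i in range(1, len(subgroups)))

good = [multiples(4), multiples(3)]      # orders 3 and 4
overlap = [multiples(2), multiples(3)]   # generate Z_12 but share the element 6
too_small = [multiples(6), multiples(2)] # do not generate Z_12

for family in (good, overlap, too_small):
    print(is_direct_sum(family), criterion(family))
```

In each of the three cases the two tests should give the same answer, as the theorem asserts.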

The concept of a direct sum may be regarded from quite a different
angle. Suppose we have k arbitrary Abelian groups $A_1, A_2, \ldots, A_k$,
among which there may be isomorphic groups. Denote
by G the set of all possible systems of the form

$(a_1, a_2, \ldots, a_k)$    (13)

composed of elements taken one at a time in each of the groups
$A_1, A_2, \ldots, A_k$. The set G will become an Abelian group if
addition of the systems of type (13) is defined by the following rule:

$(a_1, a_2, \ldots, a_k) + (a'_1, a'_2, \ldots, a'_k) = (a_1 + a'_1, a_2 + a'_2, \ldots, a_k + a'_k)$    (14)

That is, the elements are combined separately in each of the given
groups $A_1, A_2, \ldots, A_k$. Indeed, the associativity and commutativity
of this addition follows from the validity of these properties in
each of the specified groups; the role of zero is played by the system

$(0_1, 0_2, \ldots, 0_k)$

where $0_i$ denotes the zero element of the group $A_i$, $i = 1, 2, \ldots, k$.
The inverse of (13) is the system

$(-a_1, -a_2, \ldots, -a_k)$

The Abelian group G thus constructed is called the direct sum
of the groups $A_1, A_2, \ldots, A_k$ and is written, as above,

$G = A_1 + A_2 + \ldots + A_k$

This name is justified by the fact that the group G, which is the direct
sum of the groups $A_1, A_2, \ldots, A_k$ in the sense just defined, can be
decomposed into the direct sum of its subgroups $A'_1, A'_2, \ldots, A'_k$, which
are isomorphic, respectively, to the groups $A_1, A_2, \ldots, A_k$.

Namely, denote by $A'_i$, $i = 1, 2, \ldots, k$, the set of elements of G,
that is, systems of type (13), with an arbitrary element $a_i$ of group
$A_i$ in the ith position, all other positions being occupied by zeros
of the corresponding groups; these will thus be systems of the form

$(0_1, \ldots, 0_{i-1}, a_i, 0_{i+1}, \ldots, 0_k)$    (15)

The definition (14) of addition shows that the set $A'_i$ is a subgroup
of the group G. We obtain the isomorphism of this subgroup and the
group $A_i$ by associating to each system (15) the element $a_i$ of group $A_i$.

It remains to prove that the group G is the direct sum of the
subgroups $A'_1, A'_2, \ldots, A'_k$. Indeed, any element (13) of G may be
represented as a sum of elements of the indicated subgroups:

$(a_1, a_2, \ldots, a_k) = (a_1, 0_2, \ldots, 0_k) + (0_1, a_2, 0_3, \ldots, 0_k) + \ldots + (0_1, 0_2, \ldots, 0_{k-1}, a_k)$

The uniqueness of this representation follows from the fact that
distinct systems of type (13) are distinct elements of the group G.

If we have two systems of Abelian groups, $A_1, A_2, \ldots, A_k$ and
$B_1, B_2, \ldots, B_k$, and the groups $A_i$ and $B_i$ are isomorphic,
$i = 1, 2, \ldots, k$, then the groups

$G = A_1 + A_2 + \ldots + A_k$

and

$H = B_1 + B_2 + \ldots + B_k$

are also isomorphic.

Indeed, if for $i = 1, 2, \ldots, k$ there is established, between
groups $A_i$ and $B_i$, an isomorphism $\varphi_i$, which associates with each
element $a_i$ of $A_i$ an element $a_i\varphi_i$ of $B_i$, then the mapping $\varphi$, which
associates with every element $(a_1, a_2, \ldots, a_k)$ of G an element of H
defined by the equation

$(a_1, a_2, \ldots, a_k)\varphi = (a_1\varphi_1, a_2\varphi_2, \ldots, a_k\varphi_k)$

will obviously be an isomorphic mapping of the group G onto the
group H.

If we have finite Abelian groups $A_1, A_2, \ldots, A_k$ of orders
$n_1, n_2, \ldots, n_k$, respectively, then the direct sum G of these groups is also a
finite group and its order n is equal to the product of the orders of the
direct summands,

$n = n_1 n_2 \cdots n_k$    (16)

Quite true, since the number of distinct systems of type (13)
whose element $a_1$ can assume $n_1$ distinct values, whose element $a_2$
can assume $n_2$ distinct values, and so on, is determined by
equation (16).
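The construction of systems (13) with the addition rule (14) translates directly into tuples. The Python sketch below assumes, for illustration only, three cyclic summands given as residues modulo 2, 3 and 4; the names orders, add and neg are ours.

```python
from itertools import product

orders = (2, 3, 4)    # illustrative summands: residues modulo 2, 3 and 4

# All systems (a_1, a_2, a_3) with a_i taken in the i-th group.
G = list(product(*(range(m) for m in orders)))

def add(x, y):
    """Rule (14): add componentwise, each component in its own group."""
    return tuple((a + b) % m for a, b, m in zip(x, y, orders))

def neg(x):
    """Inverse system: the componentwise negatives."""
    return tuple((-a) % m for a, m in zip(x, orders))

zero = tuple(0 for _ in orders)

elements = set(G)
closed = all(add(x, y) in elements for x in G for y in G)
inverses_ok = all(add(x, neg(x)) == zero for x in G)

print(len(G), closed, inverses_ok)
```

The number of systems equals the product of the three orders, as equation (16) requires.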

Let us consider some examples. 

If the order n of a finite cyclic group {a} can be decomposed into
the product of two relatively prime natural numbers,

$n = st, \quad (s, t) = 1$

then the group {a} can be decomposed into the direct sum of two cyclic
groups having orders s and t, respectively.

Let us use the additive notation for the group {a}. If we set
b = ta, then

$sb = (st)a = na = 0$

but for 0 < k < s

$kb = (kt)a \ne 0$

which is to say that the cyclic subgroup {b} is of order s. Similarly,
the cyclic subgroup {c} of element c = sa has order t. The
intersection $\{b\} \cap \{c\}$ contains only zero because if kb = lc for
0 < k < s, 0 < l < t, then

$(kt)a = (ls)a$

whence, since the numbers kt and ls are less than n,

$kt = ls$

which is impossible due to the relative primality of the numbers s
and t. Finally, there are numbers u and v such that

$su + tv = 1$

and so

$a = v(ta) + u(sa) = vb + uc$

and, consequently, any element of the group {a} may be represented
as the sum of elements of the subgroups {b} and {c}.
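The steps of this proof are all computable. The Python sketch below assumes the cyclic group to be the residues modulo n = st with generator a = 1, and the particular values s = 4, t = 9; the helper extended_gcd, which finds the numbers u and v, is a standard routine and not something taken from the text.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = extended_gcd(b, a % b)
    return g, v, u - (a // b) * v

s, t = 4, 9           # relatively prime factors (illustrative choice)
n = s * t
a = 1                 # generator of the additive group of residues modulo n
b = (t * a) % n       # b = ta
c = (s * a) % n       # c = sa

def order(x):
    """Least k > 0 with kx = 0 in the residues modulo n."""
    k, y = 1, x % n
    while y != 0:
        y = (y + x) % n
        k += 1
    return k

g, u, v = extended_gcd(s, t)   # s*u + t*v == 1

# Every element x = xa is written as (xv)b + (xu)c.
decomposition_ok = all(((x * v) * b + (x * u) * c) % n == x for x in range(n))

print(order(b), order(c), decomposition_ok)
```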

We call an Abelian group G indecomposable if it cannot be
decomposed into the direct sum of two or several of its subgroups
distinct from the zero subgroup. A finite cyclic group whose order
is some power of the prime number p is called a primary cyclic group
relative to the prime number p. Applying several times the assertion
proved above, we find that any finite cyclic group can be decomposed
into the direct sum of primary cyclic groups relative to distinct prime
numbers. More precisely, a cyclic group of order

$n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_s^{\alpha_s}$

where $p_1, p_2, \ldots, p_s$ are distinct prime numbers, can be decomposed
into the direct sum of s cyclic groups having orders
$p_1^{\alpha_1}, p_2^{\alpha_2}, \ldots, p_s^{\alpha_s}$, respectively.

Every primary cyclic group is indecomposable.

Indeed, suppose we have a finite cyclic group {a} of order $p^k$,
where p is prime. If this group were decomposable, then, by (7),
it would have nonzero subgroups whose intersection is zero.
Actually, however, every nonzero subgroup of our group contains the
nonzero element

$b = p^{k-1} a$

To prove this, take an arbitrary nonzero element x of our group,

$x = sa, \quad 0 < s < p^k$

The number s may be written as

$s = p^l s', \quad 0 \le l < k$

where the number s' is not divisible by p and is therefore relatively
prime to it; and so there exist numbers u and v such that

$s'u + pv = 1$

Then

$(p^{k-l-1}u)x = (p^{k-l-1}us)a = (p^{k-1}us')a = p^{k-1}(1 - pv)a = (p^{k-1} - p^k v)a = p^{k-1}a = b$

which is to say the element b is in the cyclic subgroup {x}.
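The multiplier used in this computation can be produced explicitly. The Python sketch below assumes the primary cyclic group to be the residues modulo $p^k$ with generator a = 1 and the values p = 3, k = 3; the function names are ours, and the call pow(s_prime, -1, p), which needs Python 3.8 or later, supplies the number u with s'u congruent to 1 modulo p.

```python
p, k = 3, 3            # illustrative prime and exponent
n = p ** k
b = p ** (k - 1)       # the element b = p^{k-1} a with a = 1

def cyclic_subgroup(x):
    """Cyclic subgroup {x} of the residues modulo n."""
    return {(m * x) % n for m in range(n)}

def multiplier(x):
    """The number p^{k-l-1} u from the proof, for a nonzero residue x = s."""
    l, s_prime = 0, x
    while s_prime % p == 0:        # split s = p^l * s' with s' not divisible by p
        s_prime //= p
        l += 1
    u = pow(s_prime, -1, p)        # s' * u == 1 (mod p); Python 3.8+
    return p ** (k - l - 1) * u

# b lies in the cyclic subgroup of every nonzero element ...
contains_b = all(b in cyclic_subgroup(x) for x in range(1, n))
# ... and is reached by the explicit multiple constructed in the proof.
explicit_ok = all((multiplier(x) * x) % n == b for x in range(1, n))

print(contains_b, explicit_ok)
```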

The additive group of the integers (which is an infinite cyclic group)
and also the additive group of all rational numbers are indecomposable
groups.

The indecomposability of both these groups follows from the
fact that in each of them there exists, for any two nonzero elements,
a nonzero common multiple; that is, any two nonzero cyclic
subgroups have a nonzero intersection.
Note that if the operation in an Abelian group G is termed
multiplication, then instead of a direct sum we speak of a direct product.

The multiplicative group of nonzero real numbers can be decomposed
into a direct product of the multiplicative group of positive real
numbers and a group, with respect to multiplication, made up of the
numbers 1 and -1.

Actually, the intersection of these two subgroups of our group
contains only the number 1, the unit element of this group. On the
other hand, every positive number is the product of itself and the
number 1, and every negative number is the product of its absolute
value and the number -1.


67. Finite Abelian Groups

If we take any finite set of primary cyclic groups, some of which
can refer to one and the same prime number or even have the same
order, i.e., be isomorphic, then the direct sum of these groups is
a finite Abelian group. It turns out that this exhausts all finite
Abelian groups.

Fundamental theorem of finite Abelian groups. Every finite
Abelian group G which is not a zero group can be decomposed into a
direct sum of primary cyclic subgroups.

We begin the proof of this theorem with the remark that in the
group G there will inevitably be nonzero elements of prime power orders.
Indeed, if some nonzero element x of G has order l, lx = 0, and if
$p^k$, k > 0, is a power of the prime p such that $p^k$ divides the number l,

$l = p^k m$

then the element mx is different from zero and has order $p^k$.

Let

$p_1, p_2, \ldots, p_s$    (1)

be all distinct primes, some powers of which serve as the orders of
certain elements of the group G. Denote any such number by p
and the set of elements of G having powers of p as their orders by P.

The set P is a subgroup of the group G. Indeed, P includes the
element 0 since its order is $1 = p^0$. Furthermore, if $p^k x = 0$, then
$p^k(-x) = 0$ as well. Finally, if $p^k x = 0$, $p^l y = 0$ and if, say,
$k \ge l$, then

$p^k (x + y) = 0$

Thus, either the number $p^k$ or a divisor of this number, at any rate
some power of p, serves as the order of the element x + y.

Alternately taking each of the numbers (1) for p, we obtain s
nonzero subgroups

$P_1, P_2, \ldots, P_s$    (2)

The group G is the direct sum of these subgroups,

$G = P_1 + P_2 + \ldots + P_s$    (3)

True, for if x is an arbitrary element of G, then its order l can
only be divisible by certain prime numbers of the system (1),

$l = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_s^{\alpha_s}$

where $\alpha_i \ge 0$, $i = 1, 2, \ldots, s$. Therefore, as was demonstrated
at the end of Sec. 66, the cyclic subgroup {x} can be decomposed into
the direct sum of primary cyclic subgroups having orders
$p_1^{\alpha_1}, p_2^{\alpha_2}, \ldots, p_s^{\alpha_s}$, respectively. These primary cyclic
subgroups lie in corresponding subgroups (2) and, consequently, the
element x is represented in the form of a sum of elements taken one
each in all or several of the subgroups (2). This proves the equality

$G = \{P_1, P_2, \ldots, P_s\}$

which is similar to (6) of Sec. 66.

To prove the equality similar to (7) of the same section, take
any i, $2 \le i \le s$. Then any element y of the subgroup
$\{P_1, P_2, \ldots, P_{i-1}\}$ is of the form

$y = a_1 + a_2 + \ldots + a_{i-1}$

where the element $a_j$, $j = 1, 2, \ldots, i - 1$, is in the subgroup $P_j$,
that is, has order $p_j^{\beta_j}$. Then,

$(p_1^{\beta_1} p_2^{\beta_2} \cdots p_{i-1}^{\beta_{i-1}}) y = 0$

For the order of the element y we have some divisor of the number
$p_1^{\beta_1} p_2^{\beta_2} \cdots p_{i-1}^{\beta_{i-1}}$ and, consequently, the element y, if it is
different from zero, cannot be in the subgroup $P_i$. This proves that

$\{P_1, P_2, \ldots, P_{i-1}\} \cap P_i = 0$

which is what we set out to prove.
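The subgroups (2) can be computed for a concrete group. The Python sketch below assumes G to be the direct sum of the residues modulo 6 and modulo 4, stored as pairs, so that the primes in (1) are 2 and 3; all function names are ours.

```python
from itertools import product

moduli = (6, 4)   # G = Z_6 + Z_4, an illustrative group of order 24
G = list(product(*(range(m) for m in moduli)))
zero = (0, 0)

def add(x, y):
    """Componentwise addition in G."""
    return tuple((a + b) % m for a, b, m in zip(x, y, moduli))

def order(x):
    """Order of the element x: least k > 0 with kx = 0."""
    k, y = 1, x
    while y != zero:
        y = add(y, x)
        k += 1
    return k

def is_power_of(p, m):
    """True when m is a power of p (including p^0 = 1)."""
    while m % p == 0:
        m //= p
    return m == 1

primes = [2, 3]
# P_p: the elements of G whose orders are powers of p.
components = {p: [x for x in G if is_power_of(p, order(x))] for p in primes}

# Every element of G is a sum of one element of P_2 and one element of P_3,
# and no element is obtained twice.
sums = [add(x, y) for x in components[2] for y in components[3]]

print(len(components[2]), len(components[3]), sorted(sums) == sorted(G))
```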

Notice that an Abelian group, the orders of all the elements of
which are powers of one and the same prime number p, is termed
primary relative to p. Primary cyclic groups are a special case of
primary groups. Thus, the subgroups (2) are primary. They are called
primary components of the group G, and the direct decomposition (3)
is called the decomposition of this group into primary components.
Since the subgroups (2) are defined uniquely in the group G, it follows
that the decomposition of G into primary components is likewise defined
uniquely.

Quite naturally, the decomposability of any finite Abelian group
into the direct sum of primary groups reduces the proof of the
fundamental theorem to the case of a finite primary Abelian group P
relative to some prime number p. Let us consider this case.
Let $a_1$ be one of the elements of the group P having the highest
order in it. Furthermore, if in P there are nonzero elements, the
intersection of the cyclic subgroups of which with the cyclic
subgroup $\{a_1\}$ is zero only, then by $a_2$ we denote one of the elements of
the highest order among the elements with this property; thus,

$\{a_1\} \cap \{a_2\} = 0$

Let the elements $a_1, a_2, \ldots, a_{i-1}$ be already chosen. Denote
by $\{a_1, a_2, \ldots, a_{i-1}\}$ the subgroup of the group P generated by
their cyclic subgroups:

$\{\{a_1\}, \{a_2\}, \ldots, \{a_{i-1}\}\} = \{a_1, a_2, \ldots, a_{i-1}\}$    (4)

It evidently consists of all the elements of P that can be written as
the sum of multiples of the elements $a_1, a_2, \ldots, a_{i-1}$. We will say
that this subgroup is generated by the elements $a_1, a_2, \ldots, a_{i-1}$.
Let us now denote by $a_i$ one of the elements of the highest order
among those elements of P whose cyclic subgroups have a zero
intersection with the subgroup $\{a_1, a_2, \ldots, a_{i-1}\}$. Thus

$\{a_1, a_2, \ldots, a_{i-1}\} \cap \{a_i\} = 0$    (5)

Because of the finiteness of the group P, this process must
terminate. Suppose this occurs after the elements $a_1, a_2, \ldots, a_s$ have
been chosen. If by P' we denote the subgroup generated by these
elements,

$P' = \{a_1, a_2, \ldots, a_s\}$

i.e.,

$P' = \{\{a_1\}, \{a_2\}, \ldots, \{a_s\}\}$    (6)

then the cyclic subgroup of any nonzero element of the
group P has a nonzero intersection with the subgroup P'.

By (6) and by the equality (5), which holds true for
$i = 2, 3, \ldots, s$, the criterion (6), (7) of Sec. 66 shows that the
subgroup P' is the direct sum of the cyclic subgroups $\{a_1\}, \{a_2\}, \ldots, \{a_s\}$,

$P' = \{a_1\} + \{a_2\} + \ldots + \{a_s\}$    (7)

It remains to prove that the subgroup P' does indeed coincide with
the entire group P.

Let x be any element of P having order p. Since

$P' \cap \{x\} \ne 0$

and the subgroup {x} has no nonzero subgroups different from itself
(recall that the order of a subgroup is a divisor of the order
of the group, and the number p is prime), the subgroup {x} is indeed
contained in the subgroup P' and, hence, x belongs to P'. Thus, all
elements of order p of the group P lie in the subgroup P'.
Now suppose it has been proved that all elements of P whose
order does not exceed the number $p^{k-1}$ are in the subgroup P', and
let x be any element of P having order $p^k$. As the choice of the
elements $a_1, a_2, \ldots, a_s$ shows, their orders do not increase and so we
can indicate an i, $1 \le i - 1 \le s$, such that the orders of the elements
$a_1, a_2, \ldots, a_{i-1}$ are greater than or equal to $p^k$, and for $i - 1 < s$
the order of the element $a_i$ is strictly less than this number, that
is to say, less than the order of the element x. Whence it follows,
by the conditions to which the choice of the element $a_i$ is subject,
that if

$Q = \{a_1, a_2, \ldots, a_{i-1}\}$

then

$Q \cap \{x\} \ne 0$

However, in Sec. 66 it was proved that any nonzero subgroup
of a primary cyclic group {x} of order $p^k$ contains the element

$y = p^{k-1} x$    (8)

Consequently, the element y lies in the intersection $Q \cap \{x\}$ and
therefore in the subgroup Q as well. This enables one to write y
as the sum of multiples of the elements $a_1, a_2, \ldots, a_{i-1}$,

$y = l_1 a_1 + l_2 a_2 + \ldots + l_{i-1} a_{i-1}$    (9)

From (8) it follows that the element y has order p. Therefore,

$(p l_1) a_1 + (p l_2) a_2 + \ldots + (p l_{i-1}) a_{i-1} = 0$

That is to say, because of the existence of the direct decomposition (7),

$(p l_j) a_j = 0, \quad j = 1, 2, \ldots, i - 1$

The number $p l_j$ must thus be divisible by the order of the element $a_j$,
and therefore also by the number $p^k$, whence it follows that $p^{k-1}$
divides $l_j$:

$l_j = p^{k-1} m_j, \quad j = 1, 2, \ldots, i - 1$    (10)

Let

$z = m_1 a_1 + m_2 a_2 + \ldots + m_{i-1} a_{i-1}$

This will be an element of the subgroup Q and therefore of the
subgroup P' too; by (9) and (10),

$y = p^{k-1} z$    (11)

From (8) and (11) follows the equality

$p^{k-1} (x - z) = 0$

That is, the order of the element

$t = x - z$

does not exceed $p^{k-1}$ and, hence, by the induction hypothesis, t is
contained in the subgroup P'. Therefore, element x as the sum of
two elements of P', x = z + t, also belongs to the subgroup P'.
This is proof that all elements of order $p^k$ of the group P are
contained in P'.

Consequently, our inductive proof admits of the assertion that
all elements of the group P enter into the subgroup P', or P' = P.
This concludes the proof of the fundamental theorem.

As a by-product, we have that a finite Abelian group is primary
relative to a prime number $p$ if and only if its order is a power of $p$.
Indeed, it was shown that any finite primary (with respect to $p$)
Abelian group $P$ can be decomposed into the direct sum of primary
(with respect to $p$) cyclic groups, and for this reason the order of the
group $P$ is equal to the product of the orders of these cyclic groups,
that is to say, it is a power of $p$. Conversely, if a finite Abelian group
has order $p^n$, where $p$ is prime, then the order of any one of its
elements is a divisor of this number, that is, it is also some power of $p$,
and therefore the group turns out to be primary relative to $p$.
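
The criterion just stated can be checked by brute force on small examples. The sketch below is only an illustration and is not part of the text: it assumes the group is given concretely as a direct sum of residue-class groups $Z_{n_1} + \ldots + Z_{n_r}$ with elements written as tuples, and the function names are hypothetical.

```python
from itertools import product
from math import gcd, prod

def element_order(x, moduli):
    """Order of x = (x_1, ..., x_r) in Z_{n_1} + ... + Z_{n_r} (additive notation)."""
    order = 1
    for xi, n in zip(x, moduli):
        component_order = n // gcd(xi, n)          # order of xi in Z_n
        # least common multiple of the component orders
        order = order * component_order // gcd(order, component_order)
    return order

def is_power_of(m, p):
    """True if the positive integer m equals p**e for some e >= 0."""
    while m % p == 0:
        m //= p
    return m == 1

def is_primary(moduli, p):
    """True if every element of the direct sum has order a power of p."""
    elements = product(*(range(n) for n in moduli))
    return all(is_power_of(element_order(x, moduli), p) for x in elements)

# The group is primary relative to p exactly when its order is a power of p.
for moduli in [(4, 2, 8), (4, 3), (9, 3), (6,)]:
    for p in (2, 3):
        assert is_primary(moduli, p) == is_power_of(prod(moduli), p)
```

The loop compares the two sides of the "if and only if" for a few small groups; it is a sanity check, not a proof.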

The fundamental theorem does not yet exhaust the problem of 
a complete description of finite Abelian groups, since we have not 
precluded the possibility that the direct sums of two distinct sets 
of cyclic groups that are primary relative to certain prime numbers 
may prove to be isomorphic groups. Actually, this does not occur, 
as the following theorem shows. 

If a finite Abelian group $G$ is decomposed in two ways into a direct
sum of primary cyclic subgroups,

$$G = \{a_1\} + \{a_2\} + \ldots + \{a_s\} = \{b_1\} + \{b_2\} + \ldots + \{b_t\} \tag{12}$$

then both direct decompositions have one and the same number of direct
summands, $s = t$, and it is possible to establish a one-to-one
correspondence between these decompositions such that the corresponding
summands are cyclic groups of the same order, which is to say they are
isomorphic.

Note, to begin with, that if, say, in the first of the direct
decompositions (12), we collect the direct summands relative to a given
prime $p$, then their direct sum will be a primary (relative to $p$)
subgroup of the group $G$ and even a primary component of this group, since
its order is equal to the highest power of $p$ that divides the order
of the group $G$. Thus combining the direct summands in each of the
decompositions (12), in both cases we obtain a decomposition of $G$
into primary components, the uniqueness of which decomposition
has already been noted above.

This permits proving our theorem under the assumption that
the group $G$ is itself primary relative to the prime number $p$. Let the
numbering of the direct summands in each of the decompositions
(12) be chosen so that the orders of these summands do not increase,
that is, the elements $a_1, a_2, \ldots, a_s$ have, respectively, the orders

$$p^{k_1},\ p^{k_2},\ \ldots,\ p^{k_s}, \quad \text{where} \quad k_1 \ge k_2 \ge \ldots \ge k_s$$

while the elements $b_1, b_2, \ldots, b_t$ have the orders

$$p^{l_1},\ p^{l_2},\ \ldots,\ p^{l_t}, \quad \text{where} \quad l_1 \ge l_2 \ge \ldots \ge l_t$$

If the assertion of our theorem were not valid, then there would
be an $i$, $i \ge 1$, such that

$$k_1 = l_1, \ldots, k_{i-1} = l_{i-1} \tag{13}$$

but

$$k_i \ne l_i$$

Naturally, $i \le \min(s, t)$, since for each of the decompositions (12)
the product of the orders of all direct summands is equal to the order
of the group $G$. We will show that our assumption leads to a
contradiction.

For example, let

$$k_i < l_i \tag{14}$$

Denote by $H$ the set of elements of the group $G$ whose orders do not
exceed $p^{k_i}$. This is a subgroup of the group $G$, since if $x$ and $y$
are elements of $H$, then both $x + y$ and $-x$ have orders that do not
exceed the number $p^{k_i}$.

Note that the subgroup $H$ contains, for instance, the following
elements:

$$p^{k_1-k_i}a_1,\ p^{k_2-k_i}a_2,\ \ldots,\ p^{k_{i-1}-k_i}a_{i-1},\ a_i,\ a_{i+1},\ \ldots,\ a_s$$

On the other hand, if $1 \le j \le i-1$ and $k_j > k_i$, then the element
$p^{k_j-k_i-1}a_j$ has order $p^{k_i+1}$ and therefore is not in $H$. From
this it follows that the coset $a_j + H$ (recall that we are using the
additive notation!) has, as an element of the factor group $G/H$, the
order $p^{k_j-k_i}$ (if $k_j = k_i$, then $a_j$ itself lies in $H$ and the
coset has order $1 = p^{k_j-k_i}$). Such also is the order of its cyclic
subgroup $\{a_j + H\}$. We will now prove that the group $G/H$ is the
direct sum of the cyclic subgroups $\{a_j + H\}$, $j = 1, 2, \ldots, i-1$,

$$G/H = \{a_1 + H\} + \{a_2 + H\} + \ldots + \{a_{i-1} + H\} \tag{15}$$

and so its order is equal to the number

$$p^{(k_1-k_i)+(k_2-k_i)+\ldots+(k_{i-1}-k_i)} \tag{16}$$

If $x$ is an arbitrary element of the group $G$, then there exists the
representation

$$x = m_1a_1 + m_2a_2 + \ldots + m_sa_s$$

Suppose for $j = 1, 2, \ldots, i-1$,

$$m_j = q_jp^{k_j-k_i} + n_j$$

where

$$0 \le n_j < p^{k_j-k_i} \tag{17}$$

Then

$$m_ja_j = q_j(p^{k_j-k_i}a_j) + n_ja_j$$

and since the first summand of the right member is contained in $H$,
it follows that

$$m_ja_j + H = n_ja_j + H$$

On the other hand,

$$m_ia_i + H = H, \ \ldots, \ m_sa_s + H = H$$

And so

$$x + H = (m_1a_1 + H) + (m_2a_2 + H) + \ldots + (m_sa_s + H)$$

$$= (n_1a_1 + H) + (n_2a_2 + H) + \ldots + (n_{i-1}a_{i-1} + H) \tag{18}$$

Let there also be the representation

$$x + H = (n'_1a_1 + H) + (n'_2a_2 + H) + \ldots + (n'_{i-1}a_{i-1} + H) \tag{19}$$

where

$$0 \le n'_j < p^{k_j-k_i}, \quad j = 1, 2, \ldots, i-1 \tag{20}$$

Then the elements

$$n_1a_1 + n_2a_2 + \ldots + n_{i-1}a_{i-1}$$

and

$$n'_1a_1 + n'_2a_2 + \ldots + n'_{i-1}a_{i-1}$$

lie in one coset relative to $H$, i.e., their difference belongs to $H$ and
therefore

$$p^{k_i}[(n_1 - n'_1)a_1 + (n_2 - n'_2)a_2 + \ldots + (n_{i-1} - n'_{i-1})a_{i-1}] = 0$$

From this it follows [since the first of the decompositions (12) is
direct] that

$$p^{k_i}(n_j - n'_j)a_j = 0, \quad j = 1, 2, \ldots, i-1$$

and so the number $p^{k_i}(n_j - n'_j)$ must be divisible by the order
$p^{k_j}$ of the element $a_j$ and, hence, the difference $n_j - n'_j$ is
divisible by the number $p^{k_j-k_i}$. Whence, by (17) and (20), it follows
that

$$n_j = n'_j, \quad j = 1, 2, \ldots, i-1$$

which means that the representations (18) and (19) are identical. This
proves the existence of the direct decomposition (15).
Analogous arguments relative to the second of the direct
decompositions (12) will show that this same factor group $G/H$ has the
direct decomposition

$$G/H = \{b_1 + H\} + \{b_2 + H\} + \ldots + \{b_{i-1} + H\} + \{b_i + H\} + \ldots$$

By (13), the first $i-1$ summands here have the same orders as the
summands in (15), and, by (14), the summand $\{b_i + H\}$ is nonzero.
That is, the order of $G/H$ must be strictly greater than the
number (16). This contradiction proves the theorem.
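
The counting argument of the proof can be tried out numerically. The following sketch is an illustration under stated assumptions, not the author's method: the primary group is modelled as a direct sum of residue-class groups $Z_{p^{e_1}} + \ldots + Z_{p^{e_r}}$, the subgroup $H$ of elements whose order does not exceed $p^k$ is counted by brute force, and all function names are hypothetical.

```python
from itertools import product

def count_small_order(p, exponents, k):
    """Count elements x of Z_{p^e1} + ... + Z_{p^er} with p^k * x = 0,
    i.e. the elements whose order does not exceed p^k (the subgroup H)."""
    moduli = [p ** e for e in exponents]
    bound = p ** k
    count = 0
    for x in product(*(range(n) for n in moduli)):
        if all((bound * xi) % n == 0 for xi, n in zip(x, moduli)):
            count += 1
    return count

def quotient_order(p, exponents, k):
    """Order of the factor group G/H, computed as |G| / |H|."""
    group_order = p ** sum(exponents)
    return group_order // count_small_order(p, exponents, k)

def predicted_quotient_order(p, exponents, k):
    """The number (16): p raised to the sum of the positive differences e_j - k."""
    return p ** sum(max(e - k, 0) for e in exponents)

# Two candidate exponent lists for a group of order 2^4 that first differ at i = 1.
# Taking k = 2 (the smaller of the two differing exponents) separates them.
print(quotient_order(2, (3, 1), 2), quotient_order(2, (2, 2), 2))
```

Since the order of $G/H$ depends only on $G$, two decompositions with different exponent lists cannot belong to the same group; this is exactly the contradiction used above.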

We have thus obtained a complete survey of the finite Abelian
groups. Namely, we take all possible finite sets of the natural numbers

$$(n_1, n_2, \ldots, n_k)$$

different from unity, but not necessarily distinct; each one of these
numbers must be a power of some prime number. To each such set we
associate the direct sum of cyclic groups whose orders are numbers from
this set. All the finite Abelian groups thus obtained are pairwise
nonisomorphic, and any other finite Abelian group is isomorphic to one
of these groups.
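
The survey described above can be turned into a short enumeration. The sketch below is an illustration only, with hypothetical function names: for a given order $n$ it lists every set of prime powers whose product is $n$, one set per isomorphism type, by combining partitions of the exponents in the prime factorization of $n$.

```python
from itertools import product

def prime_factorization(n):
    """Return {prime: exponent} for a positive integer n, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def abelian_groups(n):
    """List the sets of prime-power orders of cyclic summands, one per
    isomorphism type of Abelian group of order n."""
    per_prime = [
        [tuple(p ** e for e in part) for part in partitions(exponent)]
        for p, exponent in sorted(prime_factorization(n).items())
    ]
    # Choose one decomposition for each primary component and concatenate.
    return [sum(choice, ()) for choice in product(*per_prime)]

print(abelian_groups(8))    # [(8,), (4, 2), (2, 2, 2)]
print(abelian_groups(36))   # [(4, 9), (4, 3, 3), (2, 2, 9), (2, 2, 3, 3)]
```

For example, there are exactly three Abelian groups of order 8 up to isomorphism, corresponding to the three partitions of the exponent 3.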
Cauchy, A. L. 13
Cayley, A. 13 

Cayley-Hamilton theorem 380, 381
Cayley numbers 111 
Ch’ang Ts’ang 12 
Change-of-basis matrix 185, 186
Characteristic 
of a field 270 
finite 271 

Characteristic determinants 78 
Characteristic matrix 199
Characteristic polynomial of a matrix 
200 

Characteristic roots 199ff, 216
Characteristic zero 270 
Chebotarev, N.G. 14, 414, 415 
Chevalley, Claude 415 
Ch’in Chiu-shao 12 
Ching ChuU'clian 12 
Circle, closed 150 
Class(es)
addition of 268 
multiplication of 268 
opposite 294 
product of 293, 301 
sum of 293, 300 
unit 294, 301 
zero 294, 300 
Closed circle 150 
Coefficient 
binomial 121 
leading 126
Cofactors 43ff
Collar, A.R. 414 
Common divisor 133 
Commutative field 302 
Commutative group 383 
Commutative ring 267


Commutativity 
of addition 108 
of multiplication 109 
Complementary minors 43 
Complex linear spaces 181, 202, 209 
Complex numbers (see also algebra of 
complex numbers) 107, 110, 112ff 
raising to a power 120 
taking roots of 120, 122, 123 
taking the square root of 122 
Complex plane 112 
Component(s) 

of an element 400 
primary 407 

Congruent modulo n 268 
Conjugate (the conjugate of a) 118
Conjugate complex numbers 118 
Conjugate elements 395 
Conjugate numbers 119 
Consistent system of linear equations 
16 

Constructive definition 103 
Continuous function 144 
Continuous groups, theory of 11, 13 
Correspondence, isomorphic 181 
Coset, left 392 
Countable set 352 
Cramer, G. 12 
Cramer’s rule 24, 53ff, 56, 57
new derivation of 97 
Criterion, Eisenstein 344, 345 
Cubic equations 226 
incomplete 226 
with real coefficients 228 
Cubic form 306 
Cubic polynomial 127
Cycle(s) 34
of degree n 35
disjoint 35
Cycle length 35
Cyclic groups 390ff
finite 391
infinite 390, 391
primary 405, 407 
Cyclic permutation 34 
Cyclic subgroups 389 
Cyclotomic polynomial 345 


d’Alembert, J.R. 12 
d’Alembert’s lemma 147, 149 
Decomposability of a finite Abelian 
group 407 

Decomposable quadratic forms 172 
Decomposition 34 
into cycles 34 
direct 400
of a group 391ff, 407
left 392

of polynomials 284
right 392 
summands of 400 

unique (of a proper rational frac- 
tion) 159 
example of 160 

Decrement of a permutation 26 
Dedekind, R. 13 
Definiteness of a form, positive 177
Definition
axiomatic 103
constructive 103
Degree 

of a λ-matrix 365

of a polynomial 303 

of a term 303 

De Moivre’s formula 120 

Denumerable set 352 

Dependence of vectors, linear 62ff

Derivative of a polynomial 141

second 141

Descartes, R. 12

Descartes’ rule of signs 247 

Descartes’ theorem 247, 249 

Determinant(s) 23
characteristic 78
definition of 23
evaluating 46ff
expansion of 47
multiplication theorem for 93
of nth order 36ff
of second and third order 22ff
second-order 23
skew-symmetric 42

of a system '>1 

theory of 12
axiomatic construction of 103f
tfiird-order L’.'*. 37 
Vandermonde 49, 325f, 336
Determinate system 16
Diagonal, principal 16
Diagonal matrices 371
Diagonalization of a matrix 203
Difference 261
Differential algebra 11
Differentiating a sum and a product,
formulas for 142
Dimension of a space 185
Diophantos of Alexandria 12
Direct decomposition 400
Direct product 406
Direct sum 400, 403
Discriminant 326, 334 
of an equation 228 
of a quadratic equation 335 


Disjoint cycles 35 
Dividend (of a polynomial) 306
Divisible 131
exactly 131

Divisibility of polynomials 131-133,
305

Division in a field, uniqueness of 267
Division algorithm 129, 131
Divisor(s) 131f
common 133

elementary (of a matrix) 376
elementary (of a polynomial) 376
greatest common 131f, 133
of integers 133 
of polynomials 133, 135, 138 
normal 394f, 398 
of a polynomial 306 
of unity 285 
Duncan, W.J. 414 


Eigenvalues 199f 
Eigenvector 209 
Eilenberg, S. 414 
Eisenstein criterion 344-345 
Element(s)
component of 400
conjugate 395 
identity (of a group) 385 
of infinite order 389 
inverse 179. 384 
multiples of 389 
opposite 179 
power of 389 
prime (of a ring) 285
of a set 261
unit 269, 383, 385
zero 180, 265
Elementary algebra 7
Elementary divisors of a matrix 376
Elementary divisors of a polynomial
376

Elementary matrix 363
Elementary symmetric polynomials
313 

Elementary transformations 74 
of a matrix 355 

Elimination of unknown 326, 331
Equalizing coefficients, method of 23
Equation(s)
cubic 226
incomplete 226
with real coefficients 228 
general theory of 12 
higher-degree 231 

homogeneous linear (see systems of 
h.l. eqs.)
nonhomogeneous (see system of n.
eqs.) 

nth-degree 232 
quadratic 225 
quartic 230 
quintic 232 

of second, third, fourth degree 225ff
solvability of by radicals 12 
systems of linear (see systems of 1. 
eqs.) 

Equivalence of λ-matrices 355ff
Equivalence relation 356
Euclidean algorithm (see Euclid’s a.)
Euclidean space(s) 204
isomorphic 208
isomorphism of 208ff
n-dimensional 204, 205
Euclid’s algorithm 133, 136, 241
Expansion of a determinant 47
Extensions 271ff


Factor(s)
double 284
invariant 361
k-fold 284
multiple 284, 287 
isolation of 288 
simple 284 
single 284 
triple 284 

Factor groups 394, 395, 396
examples of 397

Factorization of polynomials 284
into irreducible factors 281f
Faddeyev, D.K. 414
Faddeyeva, V.N. 414
False position, method of 251
Ferrari, L. 12, 230
Ferro, S. del 12

Fibonacci, L. (see Leonardo of Pisa) 12
Field(s) 9, 267f

of algebraic functions, theory of
10, 13

of algebraic numbers, theory of 10,
13

characteristic of 270 
commutative 302 
of complex numbers 
construction of 273, 275, 295 
uniqueness of 272 
concept of 257 
definition of 267 
division in, uniqueness of 267 
finite 268 


general theory of 13 
number 257, 259, 271

^**2-9**^'^ adjoining an element 

of rational fractions 297ff
of rational numbers 260, 341
splitting 296

Finite Abelian groups 406ff
fundamental theorem on 406
Finite characteristic 271
Finite cyclic group 391
Finite fields 268
Finite group 383 
Finite rings 268 

Finite-dimensional linear space 183 
Finite-dimensional spaces 182 

Finite-dimensional unitary spaces 210
Form(s)

cubic 306
of degree s 306
Jordan normal 370f
reduction of a matrix to 375
linear 62, 306
negative definite 177
normal 169, 170
of a matrix 355ff
pairs of 219, 223
positive definite 174ff
quadratic (see also quadratic form)

quartic 306
theory of 8

trigonometric (of complex number)
114

Formula (s) 

Cardan’s 227, 229 
De Moivre’s 120

for differentiating a sum and a product 142

Lagrange interpolation 153
Newton’s 323
Taylor's 145
Vieta’s 154, 296, 313

Fourier (see Budan-Fourier theorem)
Fraction(s)

partial 157
rational 156, 298
field of 297ff
in lowest terms 156
proper 156
simplified 156
symmetric 321
symmetric rational 321
Fractional rational functions 156
Frazer, R.A. 414
Free unknowns 79
Frobenius, F.G. 13
Function(s)
continuous 144 
fractional rational 156 
rational integral 337 
symmetric 312 
Functional analysis 9, 10 
Fundamental system of solutions 84 
Fundamental theorem 

of the algebra of complex numbers 
143ff

alternative proof of 337f
corollaries to 151ff
on finite Abelian groups 406
of higher algebra 143
on the similarity of matrices 367
on symmetric polynomials 314, 316,
319 


Galois, E. 9, 12, 13, 232 
theory of 11, 13
Gantmacher, F.R. 414 
Gauss, C.F. 12, 143 
Gauss’ (or Gaussian) elimination pro- 
cess 21 

Gauss' {or Gaussian) lemma 307, 342 
Gauss’ (or Gaussian) method 17, 18, 
20 

Gelfand, I.M. 414 

Generate (verb) (subgroup generated
by subgroups) 401
Generation of a linear subspace 196
Generator 390 
Geometry 

algebraic 0, 326 
projective 11 
Graeffe method 236
Grassmann, H. 13
Grave, D.A. 13, 415
Greatest common divisor 131f, 133
of integers 133
of polynomials 133, 135, 138
Group(s) 10, 382ff
Abelian 383, 385-387
finite 406ff
indecomposable 405
primary 407
addition in 385
additive 385
alternating 387
commutative 383
continuous 11, 13
theory of 13 
cyclic 390ff 
primary 407 
decomposition of 391 ff 
definition of 382, 383 


factor 394, 395, 396 
examples 397 
finite 383 

finite Abelian 406ff 
complete survey of 413 
fundamental theorem on 406 
finite cyclic 391 
general theory of 13 
infinite cyclic 390, 391 
isomorphic 385 
Lie 11 

multiplication in 382 
multiplicative 386, 391 
noncommutative 387 
order of 383 
primary 407 
primary Abelian 407 
primary cyclic 405, 407 
theory of 10, 382
Soviet school of 14
Gurevich, G.B. 414


Hamilton, W.R. 13 
Hamilton (see Cayley-Hamilton theo- 
rem) 

Hecke, E. 415
Height of a polynomial 353
Higher algebra 7, 8
Higher-degree equations 231
Highest term of a polynomial 311
“Hisab al-jabr w’al-muqabalah” 12
Hodge, W.V.D. 415
Holder, O. 13

Homogeneous linear equations (see
systems of h.l. eqs.)
Homogeneous polynomial 306
Homological algebra 13
Homomorphic mapping 397
Homomorphism(s) 394, 397
canonical 398
theorem on 398
Horner method 140, 141
Hurwitz, A. 13

Hypercomplex numbers, theory of 10
Hypercomplex systems, theory of 13


Ideals, theory of 10, 13 

Identity element of a group 383, 385 

Identity matrix 93 

Identity permutation 31 

Identity transformation 189, 195, 214 

Image 188
Imaginaries
axis of 112 
pure 112 

Imaginary part 112 
Imaginary unit 112 
Incomplete cubic equation 226 
Inconsistent system of linear equa- 
tions 16 

Indecomposability of groups 405
Indecomposable Abelian group 405 
Indefinite quadratic forms 177 
Indeterminate system 16 
Index 
of inertia 
negative 172 
positive 172 
of a subgroup 393 
Inertia 

law of 169f, 170 
negative index of 172 
positive index of 172 
Infinite cyclic group 390, 391 
Infinite-dimensional linear spaces 181 
Infinite-dimensional spaces 9 
Integers, system of 107 
Integral rational functions 156 
Interpolation, linear, method of 251 
Interpolation formula, Lagrange 153 
Invariant (adj.) 211 
Invariant factors 361
Invariant subgroup 394 
Invariants, theory of 9 
Inverse (to a class) 301 
Inverse of a permutation 33 
Inverse element 179, 384 
Inverse linear transformation 199 
Inverse matrices 93 
Inverse matrix 
left 94 
right 94 

Inverse operation 261 
Inverse polynomial 129 
Inverse transformation 199 
Inversion 29 
Irrational numbers 107 
Irreducible (of a polynomial) 281, 306 
Irreducible (of a solution) 230 
Isomorphic (adj.) 272 
Isomorphic correspondence 182 
Isomorphic Euclidean spaces 208 
Isomorphic groups 385 
Isomorphic real linear spaces 181 
Isomorphism(s) 178, 181
of Euclidean spaces 208ff
of fields 272ff
of rings 272ff
Iterative procedures 58 


Jacobson, N. 414, 415
Jordan, M.E.C. 13 
Jordan matrices 370 
Jordan matrix of order n 370 
Jordan normal form 370f 
reduction of a matrix to 375 
Jordan submatrix 371 


Kernel 

of a homomorphism 398 
of a linear transformation 197 
Khayyam, Omar 12
Kronecker, L. 13, 345
Kronecker-Capelli theorem 77, 78, 81
Kummer, E.E. 13 
Kurosh, A.G. 415 


Lagrange, J.L. 12, 13 

Lagrange interpolation formula 153 

Lagrange’s theorem 393 

Laplace, P.S. 12 

Laplace’s theorem 50, 51 

Lattice 11 

Lattice theory 11, 13 
Law of inertia 169f, 170
Leading coefficient 126
Left coset 392
Left decomposition 392
Left-identity 384 
Left-inverse 385 
Left inverse matrix 94 
Lemma (see theorem) 
d'Alembert’s 147, 149 
Gauss’ (or Gaussian) 307, 342 
on the increase of the modulus of 
a polynomial 146 

on the modulus of the highest-degree
term 145

Leonardo of Pisa (see Fibonacci) 12
Lie, S. 13

Lie groups, theory of 11
Linear algebra 7, 8, 13, 15, 276
Linear combination of vectors 62
Linear dependence of vectors 62ff
Linear equations (see systems of l.
eqs.)

Linear form 62, 306
Linear interpolation, method of 251
Linear polynomials 127, 139
Linear spaces 7, 178ff
complex 202, 209
finite-dimensional 183
infinite-dimensional 181
n-dimensional 185
Linear subspaces 195ff, 202
generation of 196 
Linear substitution 87 
Linear transformation(s) 87, 89, 188f
inverse 199
kernel of 197
nonsingular 93, 198
nonsingularity of 224
null space of 197
operations on 193 
product of 193 
by a scalar 193 
rank of 197 

with a simple spectrum 202 
singular 93 
spectrum of 200 

Linearly dependent system of vectors 
63, 64 

Linearly independent system of vec- 
tors 63 

Lobachevsky, N.I. 13
method of 256
Lyapin, E.S. 414, 415


Maltsev, A. I. 414
Mapping, homomorphic 397
Matrices (see also matrix)
diagonal 371

fundamental theorem on the similarity
of 367
inverse 93ff
Jordan 370

λ-matrices 355
canonical 356, 357
equivalence 355ff
equivalent 356
unimodular 362ff

of a linear transformation in diffe- 



noncommutative 90
numerical 355
orthogonal 210ff, 214
polynomial 355
product Ilf 12'i 
rectangular 
multiplication of 97
scalar 102
similar 192, 200

similarity of, fundamental theorem
on 367

square, similar 192
theory of 8 

Matrix (see also matrices) k; 
adjoint of 94 


augmented 21 
change-of-basis 186 
characteristic 199 
definition of 23 
diagonalization of 203 
elementary 363 
elementary divisors of 376 
elementary transformations of 355 
identity 93 

Jordan (of order n) 370 
left-inverse 94 

multiplication of by a scalar 99, 100 
normal form of 355ff
of a quadratic form 162 
reduction of to diagonal form 75, 
203 

reduction of to Jordan normal form 
375 

right-inverse 94 
square 93 
nonsingular 93 
of order n 16
singular 93 

transformations of, elementary 355 
unit 16, 93, 195, 211 
zero 100, 195
Matrix addition 99
Matrix multiplication 87ff, 89
Matrix polynomials 365f
Matrix root of a polynomial 378
Maximal linearly independent system
of vectors 65, 68
Method 

alphabetical 310 
of equalizing coefficients 23 
of false position 251 
Graeffe 256
Horner 140, 141

iterative (see iterative procedure)
of linear interpolation 251
of Lobachevsky 256
Newton’s 236, 252, 253
Sturm’s 238

Minimal polynomials 377ff
Minor(s) 43ff
complementary 43
kth-order (of a matrix) 70
of order k 43
principal (of a form) 175
Modulus 113

of a product of complex numbers 115
of a quotient of two complex numbers
116
of a sum 117

Molin, F.E. 13

Multidimensional space 7
Multidimensional vector spaces 59
Multiplication 261
of classes 268 
in a group 382 
matrix, associativity of 96 
of a matrix by a scalar 99, 100 
noncommutativity of 90 
of rectangular matrices 97 
scalar 204 

of vectors by a scalar 61 
Multiplication theorem for determi- 
nants 91, 93 

Multiplicative group 386, 391 
Multiple(s)
of an element 389 
zero 265 

Multiple factors 284, 287 
isolation of 288 
Multiple roots 141 
Multiplicity of a root 141, 152 
Murnaghan, F.D. 415


Negative definite forms 177 
Negative index of inertia 172 
Newton, Isaac 12 
Newton’s binomial theorem 120 
Newton’s formulas 323 
Newton’s method 236, 252, 253 
Noether, E. 9 
Noether, M. 13 
Nonassociative rings 267 
Noncommutative groups 387
Noncommutative matrices 90
Noncommutative ring 266
Noncommutativity of multiplication
90

Noncountable set 352
Nonhomogeneous equations 83
Nonhomogeneous system 83
Nonsingular linear transformations 93,
198

Nonsingular quadratic form 162 
Nonsingular square matrix 93 
Nonsingular transformation 211 
Nonsingularity of a linear transfor- 
mation 224 

Norm of a number 280 
Normal divisors 394f, 398 
Normal form 169, 170 
of a matrix 355ff
Normalization of a vector 208 
Normalized vector 207 
Notation, additive 400 
Null space of a linear transformation 
197 

Nullity of a transformation 197, 198


Number(s)
algebraic 349f
conjugate 350
set of 350
Cayley 111

complex 107, 110, 112ff
raising to a power 120
taking roots of 120ff, 122, 123
taking the square root of 122
conjugate 118
conjugate complex 118
hypercomplex 10
irrational 105
rational 105
field of 341
real 105 

transcendental 349, 353, 354 
Number fields 267, 260, 271 
Number rings 257, 258, 259 
Numerical matrices 355 


Okunev, L.Ya. 414, 415 
Omar Khayyam 12 
Operation 
algebraic 261 
inverse 261 
Opposite class 294 
Opposite element 179 
Order of a group 382 
Orthogonal bases 207 
Orthogonal matrices 210ff, 214
Orthogonal system (of vectors) 206
Orthogonal transformation(s) 210ff
of Euclidean space 212
Orthogonalization process 206, 207
Orthonormal bases 204, 208
Orthonormal basis 208 


Parity of permutations 34 
Part 

imaginary 112 
real 112 

Partial fraction 167 
Pedoe, D. 415 
Permutation(s) 27ff
cyclic 34
decrement of 36
definition of 28
of degree n (definition) 30, 32
even 32
identity 31
inverse of 33
multiplication of 32
odd 32
parity of 34

Pi (π), transcendence of 259
Plane, complex 112
Polar angle 113
Polynomial(s) 156
algebra of over an arbitrary field 276
algebraic viewpoint of 127 
alphabetic order of terms of 310, 311 
is annihilated by a linear transfor- 
mation 381 

characteristic (of a matrix) 200 
cubic 127 
cyclotomic 345 
decomposition of 284 
definition of 127 
degree of 303 
of degree n 127 
of degree one 139 
of degree zero 127, 129 
derivative of 141 
dividend of 306
divisibility of 131-133, 305
divisor of 306

elementary divisors of a 376
equal 127, 303
evaluating roots of 225ff
factorization of 284
into irreducible factors 281f
first-degree 127

as a formal algebraic expression 127
function-theoretic viewpoint of 127
greatest common divisor of 134
highest term of 311
homogeneous 306
identically equal 127, 303
integral, rational roots of 345ff
inverse 129
irreducible 281, 306
linear 127, 139
matrix 365f
minimal 377ff
nth-degree 127
operations on I36ff 
primitive 342
quadratic 127
quotient of 131
with rational coefficients 341ff
with real coefficients 155
reducibility of over the field of
rationals 341ff
reducible 281, 306 
relatively prime 133 
theorems on 137 
remainder in division of 131 
ring of 279, 304 
roots of 139ff


in several unknowns 303ff
sum of 304
symbols for 127
symmetric 312ff, 319ff
elementary 313

fundamental theorem on 314, 316,
319 

in two systems of unknowns 324 
value of 139, 377, 381 
from viewpoint of mathematical 
analysis 127 

Polynomial matrices 355 
Pontryagin, L.S. 415 
Position, false, method of 251 
Positive definite forms 174ff
Positive definite quadratic forms 174ff
Positive definiteness of a form 177
Positive index of inertia 172
Postmultiplication 94, 99
Power 

of an element 389 
raising complex numbers to a 120 
zero 389 
Power sums 322 
Premultiplication 99 
Primary Abelian group 407 
Primary components 407 
Primary cyclic groups 405, 407 
Primary group (subgroup) 407 
Prime element of a ring 285 
Primitive nth roots of unity 125 
Primitive polynomial 342 
Primitive root 391 
Principal-axis theorem 219, 220 
Principal diagonal 16 
Principal minors of a form 175 
Product 

of classes 294, 301 
direct 406 
of matrices 89 
scalar (of vectors) 205 
Projective geometry 11 
Proper rational fraction 156 
Proskuryakov, I.V. 414
Pure imaginaries 112 


Quadratic equations 225 
Quadratic form(s) 306 
canonical 164 
complex 162 
decomposable 172 
definition of 162 
indefinite 177 
matrix of 162 
negative definite 177
nonsingular 162
positive definite 174ff
rank of 162
real 162

reduction of to canonical form 161ff
theory of 168

reduction of to principal axes 168,
219ff

semidefinite 177 
theory of 161 
Quadratic polynomial 127 
Quadric curves and surfaces, theory 
of 161 

Quartic equations 230 
Quartic form 306 
Quasigroups, theory of 13 
Quaternions 111 
Quintic equations 232
Quotient 267 
of a polynomial 131 


Radius vector 113 

Range of values (of a transformation) 
197 
Rank 

of a linear transformation 197
of a matrix 69ff 
evaluating 72 

of a product of matrices 98 
of a quadratic form 162 
of a system of vectors 68 
Rank theorem 72, 74 
Rational fractions 156f, 298
field of 297ff 
in lowest terms 156 
proper 156 
simplified 156 
Rational numbers 107 
field of 341 

Rational roots of integral polynomials
345ff


Real linear spaces 178 
Real numbers 107 
Real part 112 
Reals, axis of 112 
Rectangular matrices 70 
multiplication of 97 
Reduced system 86 
Reducibility of polynomials over the 
field of rationals 341 ff 
Reducible (of a polynomial) 281, 306 
Reduction 

of a matrix to diagonal form 203 
of a matrix to Jordan normal form 
375 


of quadratic forms to canonical 
form 161ff 

of quadratic forms to principal axes 
168, 219ff
Regula falsi 251 
Relation, equivalence 356 
Relatively prime polynomials 133 
theorems on 137 

Relatively prime system of polynomials
138

Remainder of polynomials (in division)
131

Resultant 326, 327, 330
Right decomposition 392
Right-identity 384
Right-inverse 384
Right inverse matrix 94 
Ring(s) 10. 26ufl 
commutative 267 
concept of 257 
definition of 262 
examples of 262 
finite 268 
of functions 262 
nonassociative 267 
noncommutative 266
number 257, 258, 259 
of polynomials 279, 304 
theory of 10, 13 
Root(s) 

approximation of 250ff
bounds of 232ff
characteristic 199ff, 216
of complex numbers 120ff, 122, 123
k-fold 141
matrix 378
multiple 141
of polynomials 139ff, 378
primitive 391

rational (of integral polynomials)
345ff

simple 141

theorem on the existence of a 290f
theorems on the number of real 244f
of unity 124ff
primitive nth 125
Ruffini, P. 12


Scalar matrices 102 
Scalar multiplication 204
Scalar product of vectors 204
Schmidt, O.Yu. 14, 415
Schreier, O. 414
Self-adjoint transformation 215
Semidefinite quadratic forms 177
Semigroups, theory of 13
Sequence, Sturm’s 239 
Set 

countable 352 
denumerable 352 
noncountable 352 
Shapiro, G.M. 414 
Shatunovsky, S.O. 13
Shilov, G.E. 414
Signature of a form 172 
Similar matrices 192, 200 
Similar square matrices 192 
Similarity of matrices, fundamental 
theorem on 367 
Simple factor 284 
Simple root 141 
Simple spectrum 202, 203 
Simplified rational fraction 156 
Single factor 284 
Singular linear transformation 93 
Singular square matrix 93 
Skew-symmetric determinant 42 
Solvability of equations by radicals
12

Sominsky, I.S. 414
Space(s)

complex linear 181, 202
Euclidean (see also Euclidean space)
204

finite-dimensional 182
four-dimensional 7
of functions 185
infinite-dimensional 9
linear 7, 178ff
finite-dimensional 183
infinite-dimensional 181
n-dimensional 185
multidimensional 7
null 197
real affine 178
real linear 178, 181
isomorphic 181
real vector 178
of sequences 185
unitary 209

finite-dimensional 210
vector (see also vector spaces) 7 
theory of 9 
Spectrum 

of a linear transformation 200 
simple 202, 203 
Sperner, E. 414 
Splitting field 296
Square matrix 93
Sturm method 238
Sturm sequence 239
Sturm theorem 238ff


Subfields 271ff 
Subgroup(s) 388ff 
cyclic 389 

generated by subgroups 401 
invariant 394 
primary 407 
unit 389 

Submatrix, Jordan 371
Subspace(s) 
linear 195ff, 202 
generation of 196 
zero 195 

Substitution, linear 87 
Subtraction 261 

Successive elimination of unknowns, 
method of 15, 17 
Sum(s) 

of classes 293, 300 
direct 400, 403 
of polynomials 304 
power 322 

Summands of a decomposition 400
Sushkevich, A.K. 414, 415 
Sylow, 13 
Sylvester, J.J. 13 
Symmetric functions 312 
Symmetric polynomial in two systems
of unknowns 324

Symmetric polynomials 312ff, 319ff
elementary 313

fundamental theorem of 314, 316,
319

Symmetric rational fractions 321
Symmetric transformations 215f
System(s)